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EXECUTIVE  SUMMARY 

This  report  examines  and  evaluates  three  leading  conventional 
decoding  algorithms  for  BCH  and  Reed-Sol omon  error-correcting  codes: 

•  the  decoding  algorithm  of  Sugiyama  et  al . ,  which  is  based  on 
Euclid's  algorithm 

•  a  decoding  algorithm  developed  by  Scholtz  and  Welch  based  on 
Mills'  continued  fraction  expansion 

•  the  Berlekamp-Massey  decoding  algorithm. 

The  three  algorithms  can  be  viewed  as  slightly  differing  variations 
of  Euclid's  algorithm  for  finding  the  greatest  common  divisor  of  two 
polynomials.  All,  in  appropriate  versions,  are  suitable  for  VLSI 
implementation  in  a  two-dimensional  array  for  pipelined  decoding  of 
received  codeword  polynomials  distorted  by  errors  and  erasures. 

Extension  of  the  classical  decoding  theory  for  BCH  codes 


The  classical  decoding  theory  for  t-error-correcting  BCH  codes 
as  developed  by  Peterson,  Gorenstein  and  Zierler,  Chien,  Forney,  and 
Berlekamp  is  centered  about  the  key  equation  _ 


q(x)  =  S( x ) \ ( x ) 


(mod  x^t) 
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relating  three  important  polynomials: 

•  the  known  syndrome  polynomial  S(x) 

•  the  unknown  error  locator  polynomial  \(x) 

•  the  unknown  error  evaluator  polynomial  Mx) 

The  three  conventional  algorithms  under  study  solve  this 
equation  for  the  unknown  polynomials  A. { x )  and  Mx)  given  the  known 
syndrome  polynomial  S(x).  The  error  locations  can  then  be 
determined  by  a  Chien  search  for  the  zeros  of  A. ( x )  and  the  error 
magnitudes  can  be  calculated  directly  by  Forney's  formula 

MX"1) 
j  a'u:1) 

J 

where  Yj  is  the  jth  error  magnitude,  Xj  is  the  field  element 
denoting  the  jth  error  location,  and  \'(x)  is  the  formal  derivative 
of  the  error  locator  polynomial. 

We  have  rounded  out  the  classical  theory  by  defining  a  new 
polynomial  Mx)  such  that 

Mx)S(x)  =  x2t<\(x)  +  o(x). 

The  polynomial  A(x)  contains  the  same  information  as  the 

error  evaluator  polynomial  Q(x)  in  that  the  syndrome  polynomial  can 

be  recovered  either  from  the  pair  (Q(x),  Mx))  or  from  the  pair 

iv 


{ A( x ) ,  A( x ) ) .  This  leads  to  the  derivation  of  a  new  formula  for 
calculation  of  the  error  magnitudes  in  terms  of  A(x)  and  A(x).  This 
new  formula  (an  alternative  to  Forney's  formula)  can  be  used  if  A(x) 
is  easier  to  calculate  than  Q(x). 

Inside  Euclid's  algorithm 

Euclid's  famous  algorithm  for  finding  the  greatest  common 
divisor  of  two  integers  can  be  immediately  generalized  for  finding 
the  greatest  common  divisior  of  two  polynomials  f ( x )  and  g(x)  over  a 
given  field.  In  the  extended  version,  the  algorithm  also  yields 
polynomials  a { x)  and  b(x)  satisfying 

gcd(f (x) ,  g ( x ) )  =  a ( x ) f ( x)  +  b(x)g(x). 

This  form  of  the  algorithm,  with  suitable  modifications,  can  be  used 
to  solve  the  key  equation  to  produce  the  error  locator  polynomial 
A(x)  and  the  error  evaluator  polynomial  Q ( x )  (or  scalar  multiples 
yA(x)  and  yQ(x)  for  some  field  element  y),  given  the  syndrome 
polynomial  S(x).  Euclid's  algorithm  is  the  basis  both  for  the 
decoding  algorithm  of  Sugiyama,  et.  al .  and  for  the  decoding 
algorithm  based  on  Mills'  continued  fraction  expansion. 

Imbedded  within  Euclid's  algorithm  is  a  polynomial  division, 
itself  an  iterative  process,  which  must  be  performed  once  during 
each  iteration  of  the  algorithm.  To  implement  the  algorithm  in  a 
systolic  array,  it  is  desirable  to  break  the  polynomial  division 
down  into  its  component  sequence  of  partial  divisions,  where  each 
partial  division  consists  of  a  field  element  inversion,  a 
multiplication  of  a  polynomial  by  a  scalar,  and  a  polynomial 
subtraction . 
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We  have  looked  inside  Euclid's  algorithm  to  examine  the 
implications  of  this  replacement.  When  the  polynomial  divisions  are 
replaced  by  a  sequence  of  partial  divisions,  Euclid's  algorithm 
exhibits  a  two-loop  structure;  one  loop  is  executed  when  the  partial 
division  does  not  complete  a  polynomial  division,  and  the  other  loop 
is  executed  whenever  the  partial  division  does  complete  the 
polynomial  division.  (Both  loops  contain  common  steps.)  A  valid, 
cleaner,  and  more  efficient  algorithm  can  be  obtained  by  deleting 
one  of  the  loops,  with  suitable  modifications  to  the  remaining 
loop.  The  resulting  improved  algorithm  bears  a  striking  resemblance 
to  Berlekamp's  algorithm.  In  effect,  this  study  shows  why  the 
Berlekamp-Massey  decoding  algorithm  is  more  efficient  than  the 
decoding  algorithms  based  directly  on  Euclid's  algorithm. 

The  Berlekamp-Massey  algorithm  in  a  Euclidean  context 

Both  the  Berlekamp-Massey  algorithm  and  the  decoding  algorithms 
based  upon  Euclid's  algorithm  can  be  improved  by  adopting  features 
from  each  other.  The  chief  drawback  of  the  Berlekamp-Massey 
algorithm  when  implementated  in  a  systolic  array  is  the  need  to 
calculate  a  discrepancy  between  the  value  of  the  next  syndrome 
symbol  and  the  next  symbol  output  by  the  current  linear  feedback 
shift  register  (in  Massey's  formulation).  This  calculation  requires 
an  inner-product  computation  at  each  iteration  of  the  algorithm,  a 
computation  whose  length  increases  with  the  number  of  iterations. 

We  have  expanded  the  Berlekamp-Massey  algorithm,  employing 
additional  polynomials  including  a  remainder-like  polynomial  r(x) 
that  corresponds  to  the  remainder  polynomial  retained  in  the 
Euclidean  decoding  algorithms.  Retention  of  r(x)  obviates  the  need 
to  calculate  the  discrepancy  at  each  iteration,  fo»"  at  iteration  j 
the  jth  discrepancy  is  given  by  the  coefficient  rj.  Thus,  at  the 
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cost  of  additional  multiplications  and  storage,  expansion  of  the 
Berlekamp-Massey  algorithm  in  a  Euclidean  context  renders  the 
algorithm  suitable  for  VLSI  implementation  in  a  two-dimensional 
systolic  array. 

The  Hills* algorithm  in  a  Berlekamp-Massey  context 

Similarly,  the  decoding  algorithms  based  on  Euclid's  algorithm 
can  be  improved  by  modifications  that  move  them  closer  to 
Berlekamp's  algorithm.  The  polynomial  divisions  in  Mills'  decoding 
algorithm  are  replaced  by  a  sequence  of  partial  divisions,  again 
resulting  in  a  two-loop  structure.  One  loop  is  then  removed, 
yielding  an  improved  decoding  algorithm  based  on  our  enhanced 
version  of  Euclid's  algorithm.  We  have  confirmed  the  validity  of 
the  new  algorithm  by  demonstrating  that  the  partial  results 
generated  by  the  algorithm  can  be  mapped  by  scalar  multiplication 
into  partial  results  generated  by  its  predecessor.  The  new  version 
of  the  Mills'  algorithm  closely  resembles  the  Eucl ideanized  version 
of  the  Berlekamp-Massey  algorithm,  and  scalar  multiples  of  the 
partial  results  obtained  from  the  one  are  equated  to  the  partial 
results  obtained  from  the  other.  This  demonstrates  an  equivalence 
among  all  three  of  the  decoding  algorithms  studied. 

Inversionless  decoding  algorithms 

The  three  decoding  algorithms  under  study  and  the  variants  and 
hybrid  versions  constructed  therefrom  all  require  finite  field 
divisions  or,  equivalently,  inversion  of  finite  field  elements. 
These  requirements  can  be  removed,  at  the  cost  of  further  scalar 
multiplications,  by  Burton's  technique.  When  Burton's 
transformation  is  applied  to  the  Eucl ideanized  Berlekamp-Massey 
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algorithm  and  to  the  enhanced  version  of  the  Mills  algorithm,  the 
resulting  algorithms  are  identical,  except  for  an  algebraic  sign  in 
one  step. 

Decoding  with  erasures 

Forney  has  shown  that,  by  defining  a  (known)  erasure  locator 
polynomial  *(x)  analogous  to  the  error  locator  polynomial  'v(x)  and 
defining  a  modified  syndrome  polynomial  T(x)  by 


T(x)  =  ^(x)S(x) 


(mod  x^t) 


one  can  solve  the  key  equation  for  errors-and-erasures  decoding  of 
t-error-correcting  BCH  codes 


°(x)  =  M  x)T( x ) 


(mod  x^t) 


r(x)S(x) 


(mod  x^t) 


for  the  errata  evaluator  polynomial  °>{ x )  and  the  error  locator 
polynomial  \(x).  An  erratum  is  either  an  error  or  an  erasure.  The 
errata  locator  polynomial  n(x)  can  then  be  obtained  as 

n(x)  =  <(x)  a(x) . 

Forney's  formula  for  calculating  the  jth  erratum  magnitude  is 
rewritten  as 
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where  n‘(x)  is  the  formal  derivative  of  n(x)  and  Xj  is  the  field 
element  associated  with  the  jth  erratum  location.  Substitution  of 
T ( x )  for  S(x)  in  any  of  the  decoding  algorithms  yields  the 
appropriate  values  for  \(x)  and  Q{x). 

Blahut  has  shown,  however,  that  the  errata  locator  polynomial 
n(x)  can  be  obtained  directly  from  Berlekamp's  algorithm  if  the 
feedback  connection  polynomial  for  the  shi ft-regi ster  (in  Massey's 
formulation)  is  initialized  by  the  erasure  locator  polynomial  tc ( x ) 
in  place  of  the  polynomial  1.  By  combining  these  results,  we  derive 
decoding  algorithms  of  the  Berlekamp  type  and  of  the  Euclidean  type 
that  yield  both  the  errata  locator  polynomial  n(x.)  and  the  errata 
evaluator  polynomial  Q(x).  These  algorithms  obviate  the  usual  need 
to  calculate  n(x)  by  a  polynomial  multiplication  of  tc(x)  and  A(x). 

Conclusions 

This  study  has  demonstrated  that  versions  of  these  three 
BCH  decoding  algorithms  can  be  constructed  that 

•  allow  for  the  decoding  of  both  errors  and  erasures 

•  do  not  require  finite  field  inversions  or  divisions 

•  are  suitable  for  VLSI  implementation  in  a  two-dimensional 
systolic  array,  allowing  pipelining  of  the  received  codeword 
polynomial s. 

This  implementation  will  be  the  subject  of  further  study. 
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SECTION  1 


INTRODUCTION 


1.1  PURPOSE 

The  objective  of  Project  7560,  Error  Control  Codes  and 
Modulation  Techniques,  is  to  identify  and  develop  decoding 
algorithms  that  lead  to  architectures  well  suited  for  very  large 
scale  integrated  (VLSI)  circuit  technology  implementation.  The 
early  thrust  of  the  FY86  activities  was  to  develop  an  awareness  of 
current  research  in  convolutional  coding,  signal-space  coding,  and 
algebraic  block  coding.  We  concluded  that  algebraic  block  coding 
held  the  most  unfulfilled  potential  for  VLSI  implementation,  and 
then  focused  project  activities  in  this  area. 

Three  leading  algorithms  for  the  decoding  of  Bose-Chaudhuri- 
Hocquenghem  (BCH)  codes  and  Reed-Sol omon  codes  were  selected  for 
study  and  comparison.  These  were  the  decoding  algorithm  of 
Sugiyama,  Kasahara,  Hirasawa,  and  Namekawa;  Mills'  continued- 
fraction  algorithm;  and  the  Berlekamp-Massey  algorithm.  Hybrid 
versions  of  the  algorithms  were  developed,  and  enhancements  that 
improve  the  efficiency  of  the  algorithms  or  their  suitability  for 
VLSI  implementation  were  proposed.  This  report  documents  the 
results  of  our  investigation. 

The  report  is  intended  to  provide  the  circuit  designer  with  a 
thorough  understanding  of  the  decoding  algorithms.  The  approach 
taken  is  a  descriptive  rather  than  a  formal  mathematical  one.  Each 
algorithm  is  described  concisely  and  precisely  using  Iverson's 
programming  language  (APL).  Each  algorithm  is  illustrated  by  a 
decoding  example.  Formal  theorems  and  proofs  are  avoided  throughout 


the  report,  but  the  attempt  is  made  to  show  how  and  why  each 
algorithm  works.  Estimates  are  made  of  the  complexity  (number  of 
scalar  multiplications)  and  throughput  (number  of  cycle  times  or 
systolic  array  levels)  associated  with  each  algorithm. 

We  assume  that  the  reader  has  some  knowledge  of  the  basics  of 
algebraic  coding  theory.  The  design  and  structure  of  BCH  codes  are 
not  discussed.  Encoding  is  not  discussed.  Decoding  is  discussed  in 
great  detail,  but  with  emphasis  placed  on  solution  of  the  so-called 
key  equation.  Each  of  the  three  algorithms  solves  the  key  equation 
for  the  error  locator  and  error  evaluator  polynomials,  given  the 
syndrome  polynomial  derived  from  the  received  codeword  polynomial. 
This  is  the  critical  section  of  an  algebraic  decoder. 

1.2  BACKGROUND 

The  advent  of  BCH  codes  [1-3]  gave  birth  to  a  flurry  of 
activity  in  the  design  of  algebraic  decoders.  Peterson  [4,5] 
developed  the  fundamental  algorithm,  based  upon  inversion  of  finite 
field  matrices,  for  decoding  binary  BCH  codes.  Gorenstein  and 
Zierler  [6]  extended  Peterson's  algorithm  to  nonbinary  BCH  codes, 
noting  that  the  codes  of  Reed  and  Solomon  [7]  form  a  special  case  of 
nonbinary  BCH  codes.  Chien  [8]  proposed  a  search  method  for 
deriving  the  error  locations  from  the  error  locator  polynomial. 
Forney  [9]  gave  a  direct  formula  for  calculating  the  error 
magnitudes  from  the  error  evaluator  polynomial  and  the  formal 
derivative  of  the  error  locator  polynomial  and  also  developed  a 
method  for  decoding  erasures. 

Berlekamp  [10]  developed  an  efficient  algorithm  for  determining 
the  error  locator  and  error  evaluator  polynomials.  Massey  [11] 
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elucidated  Berlekamp's  algorithm  by  deriving  it  as  a  method  for 
synthesizing  the  shortest  linear  feedback  shift  register  (LFSR)  that 
will  generate  a  given  sequence.  The  algorithm  now  is  frequently 
called  the  Berlekamp-Massey  algorithm. 

Sugiyama,  Kasahara,  Hirasawa,  and  Namekawa  [12]  employed 
Euclid's  algorithm  to  solve  the  key  equation  for  the  error  locator 
polynomial  and  the  error  evaluator  polynomial.  Mills  [13] 
gave  a  continued-fraction  algorithm  for  finding  linear  recurrences. 
Mills'  algorithm  is  in  essence  the  same  as  the  algorithm  of 
Sugiyama  et  al .  Welch  and  Scholtz  [14]  showed  an  equivalence 
between  Mills'  algorithm  and  the  Berlekamp-Massey  algorithm. 

We  regard  these  last  three  algorithms  as  variants  of  Euclid's 
algorithm.  Viewing  the  Berlekamp-Massey  algorithm  in  a  Euclidean 
framework,  for  instance,  provides  a  deeper  insight  into  its 
workings,  leading  to  computational  simplification  and  to  the 
elicitation  of  further  information  from  its  employment. 

1.3  SCOPE 

This  report  contains  a  detailed  examination  of  these  three 
algebraic  decoding  algorithms  proposed  for  the  decoding  of  BCH 
error-correcting  codes.  We  compare  the  suitability  of  the 
algorithms  for  VLSI  implementation.  All  three  algorithms  are  viewed 
essentially  as  variants  of  Euclid's  extended  algorithm  for 
polynomials.  Several  versions,  including  hybrids,  of  these 
algorithms  are  developed  and  compared.  Each  version  is  represented 


in  the  form  of  a  program.  This  provides  both  concision  and 
precision  in  the  description  of  each  algorithm,  highlighting  the 
similarities  and  differences  between  different  variants. 


Section  2  of  the  report  reviews  Euclid's  algorithm.  Section  3 
consists  of  two  parts:  a  brief  review  of  the  decoding  problem  for 
BCH  codes  and  of  the  classical  decoding  algorithms  as  developed  by 
Peterson,  Gorenstein  and  Zierler,  Chien,  and  Forney;  and  an 
extension  of  the  classical  development,  providing  an  alternative  to 
Forney's  formula  for  calculating  the  error  magnitudes. 

Sections  4,  5,  and  6  contain  reviews  of  the  three  algebraic 
decoding  algorithms  under  consideration.  Section  4  examines  the 
algorithm  of  Sugiyama,  Kasahara,  Hirasawa,  and  Namekawa.  Section  5 
explores  the  relationship  between  Euclid's  algorithm  and  Mills' 
continued-fraction  expansion.  The  decoding  algorithm  obtained  by 
Scholtz  and  Welch  from  Mills'  algorithm  is  examined  and  shown  to  be 
essentially  the  same  as  the  algorithm  of  Sugiyama  et  al .  Section  6 
reviews  the  decoding  algorithm  invented  by  Berlekamp  and  rederived 
by  Massey  in  the  context  of  LFSR  synthesis. 

Section  7  contains  the  main  results  of  the  report  and  is 
divided  into  five  parts.  In  the  first  part,  the  Berlekamp-Massey 
algorithm  is  expanded  in  a  Euclidean  context  by  the  calculation  of 
additional  polynomials  analogous  to  those  computed  in  the  extended 
Euclid's  algorithm.  The  resulting  algorithm  is  more  efficient  for 
VLSI  implementation  because  the  inner-product  calculation  of 
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discrepancies  is  obviated.  Section  7.2  examines  a  new  decoding 
algorithm  proposed  by  Citron,  and  shows  that  it  belongs  to  the  class 
of  Euclldeanized  Berlekamp-Massey  algorithms  developed  in  section 
7.1.  Euclid's  algorithm  for  polynomials  is  dissected  in  section 
7.3.  The  polynomial  divisions  inherent  in  the  algorithm  are  first 
separated  into  their  component  partial  divisions.  The  resulting 
algorithm  is  then  modified  and  rearranged  into  a  cleaner,  more 
efficient  version  of  Euclid's  algorithm.  In  section  7.4  this  same 
dissection  and  modification  are  applied  to  Mills'  decoding 
algorithm.  The  resulting  algorithm,  incorporating  the  more 
efficient  version  of  Euclid's  algorithm,  is  more  efficient  for 
decoding  and  closely  parallels  the  Eucl ideanized  versions  of  the 
Berlekamp-Massey  algorithm.  An  equivalence  between  Mills'  algorithm 
and  the  Berlekamp-Massey  algorithm  is  then  established  by 
demonstrating  that  the  partial  results  obtained  by  the  various 
versions  can  be  mapped  into  one  another  by  multiplication  by 
suitable  scalar  factors.  Section  7.5  summarizes  the  similarities 
and  differences  among  the  several  algorithms  and  their  variants. 


In  section  8,  the  decoding  algorithms  are  modified,  using  a 
technique  developed  by  Burton,  to  avoid  all  finite  field  division  or 
inversion.  The  resulting  versions  of  Mills'  algorithm  and  the 
Berlekamp-Massey  algorithm  are  seen  to  be  nearly  identical.  In 
section  9,  results  are  extended  to  include  the  decoding  of  erasures 
as  well  as  errors.  Results  of  Forney  and  of  Blahut  can  be  combined 
to  give  decoding  algorithms  that  directly  furnish  both  the  errata 
locator  polynomial  and  the  errata  evaluator  polynomial.  Section  10 
summarizes  the  conclusions  of  the  study. 
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SECTION  2 

EUCLID'S  ALGORITHM 

In  Book  VII,  Proposition  2  of  his  Elements  [15],  Euclid  gave 
his  famous  algorithm  for  finding  the  greatest  common  divisor 
gcd  (s,t)  of  two  Integers  s  and  t. 

Let  s  >  t  >  0.  Dividing  s  by  t, 

s  =  qi t  +  rl ,  (0  <_  rx  <  t) 

and,  if  rx  >  0,  the  gcd  (s,t)  must  also  divide  the  remainder  r: . 
Continuing  the  process  if  rL  ^  0, 

t  =  q2r2  +  r2,  (0  _<  r2  <  r2). 


rk  =  V2V1  *  rk*2-  (0  1  rk+2  <  W 


until  finally,  for  some  n,  rn  must  equal  0: 


r„  9  =  q„r  .  +  0. 
n-2  n  n-1 


Since  rn.i  divides  rn_2.  it  must  divide  rn„3  =  0n-lrn-2  + 
rn_i,  and,  similarly,  all  r\,  back  to  r0  =  t  and  r_x  *  s.  Thus, 
rn_j  is  a  common  divisor  of  s  and  t,  and  since  the  gcd  (s,t)  must 
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divide  each  nonzero  remainder  r^,  rn_i  must  be  the  greatest 
common  divisor  of  s  and  t.  The  recursion  of  Euclid's  algorithm  is 
given  by 


rk-2  "  qkrk-l* 


A  simple  program  for  Euclid's  algorithm  is  shown  in  program  1. 
The  programming  notation  is  loosely  adapted  from  Iverson  [16].  The 
back-arrow  can  be  read  as  "is  specified  by,"  the  colon  as  "is 
compared  to."  For  each  specification  statement,  the  quantity  to  the 
left  of  the  arrow  is  replaced  by  the  quantity  to  the  right.  For 
comparison  statements,  branches  leaving  the  statement  at  either 
side  are  followed  if,  and  only  if,  the  relation  represented  by  the 
branch  label  is  satisfied  when  substituted  for  the  colon  in  the 
comparison  statement.  A  branch  with  no  label  is  always  followed. 


The  notation  I a/b  ]  denotes  the  integer  part  or  quotient  of  the 
division  of  a  by  b.  r°  and  rN  represent  the  "old"  and  "new" 
values  of  the  remainder  rj<  at  iteration  k.  At  the  beginning  of 
the  kth  iteration,  r°  is  r^_2  and  rN  is  r^.j;  at  the  end 
of  the  kth  iteration,  r°  is  r^.i  and  rN  is  r^. 


Example  1:  s  =  9022,  t  =  4719. 


Initialization: 


rO  <-  9022 
rN  <-  4719 


Iteration  1: 


q  -  1 
y  <-  4303 
r0  <-  4719 
rN  <-  4303 
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Iteration  2: 


q  <-  1 
y  «-  416 
r0  4303 
rN  <-  416 

Iteration  3:  q  <-  10 

y  <-  143 
r0  <-  416 
rN  -  143 

Iteration  4:  q  <-  2 

y  <-  130 
rO  v  143 
r^  -  130 

Iteration  5:  q  «-  1 

y  <-  13 
rO  <-  130 
r^  <-  13 

Iteration  6:  q  <-  10 

y  <-  0 
rO  *  13 
rN  «-  0 


At  termination  of  the  example,  when  rN  =  0,  the  gcd 
(9022,4719)  is  found  to  be  13.  Tracing  through  all  the  remainders 
r°  and  rN  for  the  six  iterations  yields 


gcd(9022,4719)  -  gcd  (4719,4303) 
=  gcd  (4303,416) 

=  gcd  (416,143) 

«  gcd  (143,130) 

=  gcd  (130,13) 

=  gcd  (13,0) 

=  13 


Euclid's  algorithm  also  will  provide  integers  a  and  b  which 
satisfy  the  equation 


The  recursion  (3)  for  a^  and  can  be  written  In  matrix 
form  as 


ak  ak-l 


bk  bk-l 


ak-l  ak-2 


bk-l  bk-2 


ak-2  ak-3 


bk-2  bk-3 


1  0 


1  0 


-qk  1 


1  0 


0  1  -qi  1  -q2  1 


1  0 


1  0 


1  0 


Taking  determinants  on  both  sides, 


akbk-l  ‘  bkak-l  =  ('1} 


which  gives  the  useful  result  that  gcd(a|c,b|c)  =  1  for  all  k. 
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Euclid's  algorithm  can  be  immediately  extended  to  polynomials 
and,  in  particular,  to  polynomials  over  a  finite  field  GF(q).  The 
greatest  common  divisor  of  two  nonzero  polynomials  f ( x)  and  g(x)  is 
defined  as  the  monic  polynomial  of  greatest  degree  that  divides  both 
f(x)  and  g(x).  For  every  pair  of  polynomials  f(x)  and  g(x),  with 
g(x)  *  0,  there  exists  a  unique  pair  of  polynomials  q(x)  and  r(x) 
such  that 

f ( x )  =  q( x)g(x)  +  r(x)  (6) 

and  deg ( r ( x ) )  <  deg  (g(x)).  However,  r(x)  is  not  necessarily 
monic.  At  the  termination  of  Euclid's  algorithm  (program  3), 


P«gcd(f(x),g(x))  =  r°(x)  =  a°(x)f(x)  +  b°(x)g(x) 


where  p  =  r o  e  GF(q)  is  a  field  element. 

Program  3  is  identical  to  program  2,  except  that  polynomials 
have  replaced  integers  in  all  program  statements.  The  notation 
f ( x )/g( x )  represents  the  rational  function  or  infinite  power  series 
in  x  obtained  when  the  polynomial  f(x)  is  divided  by  the  polynomial 
g(x).  The  notation  [f(x)/g(x) j  denotes  the  quotient  polynomial  q(x) 
of  (6),  i.e.,  the  polynomial  or  "integer"  part  of  the  division  of 
f ( x )  by  g(x).  The  notation  f ( x )  | g( x)  be  use<1  represent 
the  remainder  polynomial  r(x  . 


»,*.  •*.  V- 
•J \  K. \  K 

*w  %vy*> 

i 


v„\£ 


The  recursion  in  program  3  is  directly  analogous  to  that  of 
program  2.  An  analogy  can  also  be  drawn  to  equations  (4),  and  (5) 
for  polynomials  ak(x),  bk(x)  and  rk(x): 

ak(x)f(x)  +  bk (x)g(x)  =  rk(x)  (7) 

ak{x)bk_1(x)  -  bk{ x )ak_1 ( x)  =  (-l)k+1  (8) 

Equation  (8)  implies  that  gcd(ak(x) ,bk(x) )  =  8,  a  field  element. 

Example  2:  f(x)  =  x5  +  3x4  +  3x2  +  5x  +  10 
g(x)  =  2x2  +  7x  +  3 

over  GF(ll). 

f(x)/g(x)  =  6x3  +  8x2  +  7x  +  9  +  10x_1  +  6x-2  +  8x-3  +  7x_l4  +  2x-5 

+  10x-6  +  6x-7  +  8x-8  +  7x-9  +  2x-10  +  ... 

[f(x)/g(x)J  =  6x3  +  8x2  +  7x  +  9  =  qMx) 

|f(x)|g(x)  =  9x  +  5  =  r1 (x) 

Results  of  applying  program  3  are  presented  in  the  following  table: 
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p*gcd(f(x),g(x))  =  9x  +  5 

=  f(x)  +  (5x3  +  3x2  +  4x  +  2)g(x) 

8  =  9 

gcd(f(x),g(x))  =  x  +  3 

Program  3  is  adapted  in  sections  4  and  5  for  solving  the  key 
equation  for  BCH  decoding.  First,  a  brief  review  of  the  decoding 
problem  is  presented  in  section  3. 
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SECTION  3 

THE  DECODING  PROBLEM 

In  this  section,  based  in  part  on  material  from  Peterson  [5] 
and  Blahut  [17],  the  decoding  problem  for  BCH  codes  arid  the  key 
equation  are  briefly  reviewed.  In  subsequent  sections  various 
algorithms  which  have  been  proposed  for  solving  the  key  equation  and 
which  are  shown  to  be  variants  of  Euclid's  algorithm  are  presented. 
This  section  presupposes  some  familiarity  with  algebraic  coding 
theory  on  the  part  of  the  reader.  Encoding  is  not  discussed,  and 
knowledge  of  the  structure  and  construction  of  generalized  BCH  and 
Reed-Sol omon  codes  is  assumed.  The  reader  should  refer  to  any  of 
several  excellent  standard  texts  on  algebraic  coding  theory,  e.g., 
[10],  [17-20],  for  further  background  material. 

This  section  is  divided  into  two  parts.  In  section  3.1  the 
classical  theory  as  developed  by  Peterson  [4],  Gorenstein  and 
Zierler  [6],  and  Forney  [9]  is  presented.  In  section  3.2  a  slightly 
different  view  is  taken,  and  an  alternative  to  Forney's  formula  for 
the  error  magnitudes  is  developed.  The  new  formula  is  important  for 
completion  of  the  classical  theory  rather  than  for  any  computational 
advantage. 

3.1  THE  CLASSICAL  BCH  DECODING  THEORY 
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Assume  a  BCH  code  designed  to  correct  t  errors  in  a  codeword  of 
length  n,  where  n  is  the  order  of  the  element  a  of  GF(qm),  the 
finite  field  of  q^  elements,  used  in  defining  the  code,  q  is  a 
power  of  a  prime,  and  m  is  a  given  integer.  If  a  is  a  primitive 
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element,  n  =  qm  -  1.  Throughout  this  report  we  assume  that  the 
code  is  generated  by  a  generator  polynomial  g(x)  defined  by 


g ( x )  =  1cm  ( f ! { X ) ,  f 2 ( x ) ,  f2t(x)) 


where  fj(x)  is  the  minimal  polynomial  of  ftJ  and  lcr,i  denotes  the 
least  common  multiple.  Let  c(x)  represent  the  transmitted  codeword 
polynomial,  e(x)  be  an  error  polynomial,  and  v(x)  =  c ( x)  +  e(x)  be 
the  received  codeword  polynomial.  Define  2t  error  syndromes  Sj  by 

Sj  =  v(aO)  =  c(kJ)  +  e(cJ)  =  e  ( a  j )  (j  =  1 . 2t) 

where  the  (j  =1,  ....  2t)  are  roots  of  the  code  generator 
polynomial  g(x). 

Suppose  v  (v  t)  errors  have  occurred  during  transmission. 
Define  v  unknown  error  locations  X5,  where  X?  is  the  field 
element  of  GF(qm)  associated  with  the  error  location  in  the 
codeword  (numbered  in  ascending  order  of  the  error  location 
numbers),  and  v  unknown  (unless  q  =  2)  error  magnitudes  Yj,  where 
In  i  0  and  Y 9  e  GF(q).  For  example,  if  e(x)  =  6x9  +  5x8  +  3x3, 
then  v  =  3,  Xx  =  <x3,  X2  =  n8,  X3  =  ft9,  Yj  =  3,  Y2  =  5,  Y3  =  6. 


The  2t  syndromes  are  given  by  the  2t  BCH  decoding  equations 
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The  decoding  problem  for  BCH  codes  is  to  solve  this  set  of  2t 
(nonlinear  simultaneous)  equations  for  the  v  unknown  error  locations 
X*  and  the  v  unknown  (if  q  >  2)  error  magnitudes  Y*,  given  the 
2t  syndromes  Sj. 

To  assist  in  finding  a  solution,  the  error  locator  polynomial 
is  defined  as  the  monic  polynomial  having  zeros  at  the  inverse  error 
locations  Xj1  for  l  =  1,  v: 


V  V 

\(x)  =  F  (1  -  X$x)  =  1  +  \  A x 1 

i=l  i=l 


(If  u  -  C,  a(x)  is  defined  as  the  zero-degree  polynomial  1.) 

Next,  multiply  (10)  by  Y*xi+V  and  set  x  =  X*1  For  each  j  and 
each  i  we  then  have  an  equation 


0  =  Y*xi+V(l  +  l  AiXj1 ) 
i=l 


=  Y,(xr+  i  Mxi^-1) 

i=l 


Summing  over  l  from  1  to  v,  for  each  j 


0  =  T  Y txfv  *  !  *1  '  V,„X^+V-1 

»=1  i=l  *=1 


Sj+v  +  ^  ^ i S j+v-i 

i=l 
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(see,  e.g.,  Hamming  [21],  pp.  233-4).  Since  *  X|<  for  l  *  k, 

V  t  0.  Since  X?  and  are  nonzero  for  z  =  1,  v, 

D  =*0.  Hence,  js  *  0,  and  the  system  of  equations  (12)  can  be 
solved  by  inverting  the  matrix  of  syndromes  S.  This  forms  the  basis 
for  Peterson's  decoding  algorithm  [4,5]. 

There  is  one  remaining  problem.  The  actual  number  of  errors  v 
is  unknown.  Peterson's  algorithm,  therefore,  initializes  v  as  t  and 
evaluates  the  determinant  of  the  t  *  t  syndrome  matrix  S.  If 

|S|  *  0,  then  K  is  obtained  as  $-1.  (-St+i,  -St+2»  •••. 

-S2t)T;  if  | s |  =  0,  V  is  reduced  by  1  and  S  is  redefined  by 
deletion  of  the  last  row  and  column  as  a  v  *  v  matrix.  Eventually, 
unless  all  Sj  =  0  (implying  no  transmission  errors  have  occurred). 
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for  some  value  of  v  we  have  |s|  *  0,  and  A  is  obtained  as  (av, 

• ...  A1 )T  =  S-1  •  (-Sv+i,  ....  -S£V)T. 

From  Peterson's  decoding  algorithm  we  know  that  for  a 
t-error-correcting  BCH  code,  so  long  as  v  <  t  errors  occur  in  the 
transmission  of  a  codeword,  there  exists  a  unique  error  locator 
polynomial  a(x)  of  degree  v  which  is  a  solution  to  the  system  of 
equations  (12).  The  decoding  algorithms  to  be  discussed  in  the 
succeeding  sections  finesse  the  direct  inversion  of  the  syndrome 
matrix,  yet  rely,  to  some  extent,  on  Peterson's  result  which  proves 
the  existence  and  uniqueness  of  the  solution. 

To  complete  the  decoding  after  finding  A ( x)  we  need  to 
determine  the  error  locations  and,  if  q  >  2,  calculate  the  error 
magnitudes  .  Since  A(x)  was  defined  as  the  polynomial  having 
zeros  at  the  inverses  of  the  error  locations,  these  inverses  can  be 
obtained  by  evaluating  a(x)  for  all  field  elements  of  GF(qm). 

This  brute  force  solution  was  first  proposed  by  Chien  [8]  and  is 
known  as  a  Chien  search.  It  can  be  efficiently  implemented  as  a 
finite  field  transform  [17], 

When  q  >  2,  the  error  magnitudes  can  be  obtained  from  the  2t 
equations  (9)  by  substituting  the  v  determined  error  locations  X* 
into  the  first  v  equations  and  inverting  the  v  x  v  matrix 
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whose  determinant  is  equal  to  X1X2...XV  •  |v|  ,  where  Y  is  again 

the  Vandermonde  matrix,  yielding  (Yi,Y2 . YV)T  =  x-1  • 

( Si ,S2 .... ,SV)T.  However,  a  more  efficient  algorithm  has  been 
given  by  Forney  [9],  based  on  the  direct  computation  of  the  Y^ 
from  the  error  locator  polynomial  and  the  error  evaluator 
polynomial . 

Let  the  syndrome  polynomial  $(x)  be  defined  by 


2t  .  .  2t  v  ,  ,  , 

S(  x )  =  T  S.xJ_1  =  l  l  Y,XJ.xJ-1 
j=l  0  j=l  i  =  l  1  1 

and  let  the  error  evaluator  polynomial  n(x)  be  defined  by 


Q{x)  =  S(x)a(x)  x2t« 


Equation  (14)  is  known  as  the  "key  equation"  for  BCH  decoding.  (It 
differs  slightly  from  the  form  given  by  Berlekamp  in  [10],  as  the 
syndrome  polynomial  has  been  defined  differently,  allowing  a  slight 
simplification  of  Forney's  formula.) 
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3.2  EXTENSION  OF  THE  CLASSICAL  THEORY 

In  this  section,  an  alternative  to  Forney's  formula  (16)  for 
the  error  magnitudes  is  developed.  To  facilitate  this  discussion, 
the  concept  of  reversal  of  a  polynomial  is  introduced.  If 


p(  x )  =  Y  p-jxJ 
0=0 


is  a  polynomial  over  some  field,  then  the  reversal  of  p(x),  denoted 
by  p(x),  is  defined  by 

n 

p(x)  =  Y  Pn-jx^  =  xnp(l/x)  (17) 

0=0 

i.e.,  p ( x )  has  the  same  coefficients  as  pfx)  but  in  reverse  order. 
Let 


a(x)  *  l  ajxJ 


n2 

b( x)  «  l  bjxJ 
0=0 


By  definition  of  polynomial  multiplication,  their  product  is  given 
by 


a(x)b(x)  =  c(x)  =  Y  c-ixJ 
0=0  J 


it 
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n  =  +  n2 


cj  =  f,  aibj-i 
1=0 


i(x)  =  ^an,-jxJ 

j=0  ni  J 


5(x)  =  l  b  .x^ 
j=0  n2‘J 


so  that 


a(x)6(x)  =  d(x)  =  T  d.x^ 


j=0  J 


where 


n  =  +  n2 
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i.e.,  dj  is  the  sum  of  »1J_  products  of  the  form  whose 
indices  k  and  si  sum  to 

n,  -  i  +  n2  -  j  +  i  =  n  -  j. 


But 


a(x)b(x)  =  c(x) 


n 


j 


where 


n-0 

c  .  -  )  a.b  .  . 
n-0  jl0  1  n-j-i 

i.e.,  cn_j  is  the  sum  of  al 1  products  of  the  form 
a^b^  whose  indices  sum  to 

i  +  n  -  j  -  i  =  n  -  j. 


Hence, 


Similarly,  if  c'(x)  =  gcd(?(x),g(x)),  then  c'(x)  is  a  common  divisor 
of  f ( x )  and  g(x).  Thus 

c'{x)  =  8c(x) 

for  some  field  element  8. 

An  alternative  formula  to  (16)  can  now  be  developed  in  terms  of 
reversals.  Let 


A ( x )  =  [ (A(x)S(x) )/x^xj. 


A(x)S(x)  =  xaMx)  +  Q(x).  (20) 

The  polynomials  A ( x )  and  Q(x)  contain  equivalent  information  in  that 
the  syndrome  polynomial  can  be  recovered  either  from  the  pair 
(a(x),Mx))  or  from  the  pair  (a(x).Q(x)): 

s(x)  =  x2-~ (x{x7_°(x)  =  |(x2tA(x))/A(x)J  (21) 

since  deg( A ( x ) )  =  v  and  deg(Q(x))  <  v  -  1.  Similarly, 


5 ( x )  =  x  -Q(x)  *-*(x)  =  [(x2tQ(x))/A(x)J. 
A  ( x ) 


■  v/xy 
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The  degrees  of  the  polynomials  in  (21)  and  (22)  are  defined  as 
follows.  By  (10),  A(x)  has  degree  v,  so 

A(x)  =  XV  +  A1XV'1  +  ...  +  A  ,X  +  A  . 

1  v-l  V 

The  deg(S(x))  is  defined  as  2t  -  1  (even  when  S2t  =  0)  and 
deg(Q(x))  as  v  -  1  (even  when  qv_i  =  0)  so  that 

$(x)  =  SiX2t  1  +  ...  +  S2t 

and 

Q  ( x )  =  Q0xv_1  +  QiXV"2  +  ...  +  *^v_2x  +  Qv  1’ 

Finally,  a(x)S(x)  is  defined  to  have  degree  v  +  2t  -  1  and  A(x)  is 
defined  to  have  degree  v  -  1  so  that 

Mk)  =  A0xv-1  +  A^-2  +  ...  +  Av_2x  +  A 

Example  3:  Reed-Sol omon  code  defined  over  GF(ll)  with  a  =  2,  t  =  3, 
g(x)  =  (x  -  2 ) ( x  -  4 ) ( x  -  8) (x  -  5)(x  -  10 ) (x  -  9) 

=  x6  +  6x5  +  5xh  +  7x3  +  2x2  +  8x  +  2. 

Let  c(x)  =  0,  v(x)  =  e(x)  *  6x9  +  5x8, 


33 


From  the  definition  of  the  reversal  (17)  and  (10) 


A(x)  =  n  (1  -  X.x) 
i=l  1 


by  (18) 


=  TT  (-X,  +  X) 
1=1  1 


=  n  (-X,)(l  -  xTxx) 
i=l  1  1 


=  n  (-X.)  n  (1  -  xTAx) 
i=l  1  i=l  1 


=  a  n  (l  -  xTAx). 

vi=l  1 


Equation  (23)  says  that  A(x)  is  an  unnormalized  error  locator 
polynomial  for  an  error  polynomial  with  errors  at  the  Inverse  error 

locations  Xj1  (i  =  1,  ....  v).  This  will  be  used  in  deriving  the 
new  formula  for  error  magnitudes. 

It  hes  been  shown  In  (21)  and  (22)  that  the  polynomials  A(x) 
and  q(x)  are  equivalent  In  the  sense  that  either  can  be  used  In 
conjunction  with  a(x)  to  obtain  the  syndrome  polynomial  S(x).  This 
equivalence  between  A(x)  and  n(x)  suggests  that  an  alternative  to 
Forney's  formula  (16)  can  be  derived  in  terms  of  the  polynomial 
A(x).  We  now  show  that 
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where  d  =  2t  +  1  is  the  designed  distance  of  the  code.  Define 


e*(x)  .  "w  ,«'"-1)dx1 
1=0  n_1 


where  all  exponents  and  subscripts  are  taken  modulo  n.  Let  X*j 
and  Y*j  denote  the  error  location  numbers  (field  elements)  and 
error  magnitudes  specified  by  e*(x),  but,  for  convenience,  indexed 
to  correspond  directly  to  Xi  and  Yi.  For  example,  if  over 
GF (11)  with  a  =  2,  e(x)  =  6x9  +  5x8  +  3x3,  then  e*(x)  =  6x7  +  x2  + 
4x  and  X*!  =  a7,  X*2  =  a2,  X*3  =  a1,  Y*j  =  6,  Y*2  =  1,  Y*3  =  4. 
Next,  let  S*(x)  denote  the  syndrome  polynomial  defined  with  respect 
to  the  error  polynomial  e*(x): 


S*.  =  e*(ctJ) 


(n-i)d  ij 
>  e„  .a  ~ 

1=0  n"1 


a 


n-1 

I  en  ■5° 
1=0 


i( j-d) 


(=0  ' 


sd-j 
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Therefore,  $*(x)  =  S(x). 

i 

Let  a*(x)  denote  the  error  locator  polynomial  defined  w.r.t. 
the  error  polynomial  e*(x)  and  let  Q*(x)  denote  the  corresponding 
error  evaluator  polynomial.  Then 


a*(x)  =  n  (l  -  x*. x) 
i=l  1 


x*i  ■  Xi1 


so  that 


A*{x)  =  n  (1  -  XT^x )  =  A-1A(x) 
i  =  l  1  v 


by  (23).  Then 


Q*(x)  =  |a*(x)S*(x)|  2t 
=  A”1 |K(x)5{x) |  2t 
=  a_1K(x) . 


By  Forney  s  formula 


VV 


But  if  Xj  =  ak,  then 


Q*(X*Ta) 
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A'(X*TA) 
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=  Y  X^ 

TT 


Therefore 


*(X.) 


iL  (x?)-1 


A'(X.)  J 
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fiV.n-d 

A*  (Xj  >  J 


giving  us  an  alternative  to  (16)  for  calculating  the  error 
magnitudes  Yj. 
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Example  4:  Reed-Sol  onion  3-error-correcting  code  over  GF  (11). 
Let  a  -  2,  c(x)  =  0,  v(x)  =  e(x)  =  6x9  +  5x6  +  3x3. 


Then  S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 


A.  ( x )  =  10x3  +  2x2  +  5x  +  1 
\ * ( x )  =  8x2  +  4x  +  5 


°(x)  =  3x  +  3x  +  9 

By  Forney's  formula  (16) 


A( x)  =  x3  +  5x2  +  2x  +  10 

A  * ( x )  =  3x2  +  lOx  +  2 

£{x)  =  2x  +  1 


Yi  =  - 


0(a7)  _  3a4  +  3a7  +  9  _ 


\  '  (  a7  )  8a4  +  4cr7  +  5 


4+10+9 
7  +  6  +  5 


-  =  3 


Y2  =  - 

Y3  =  - 


3f4  +  3  a2  +  9 


A'f't2) 

P(aM 


8a4  +  4a2  +  5 


4  +  1  +  9 
7  +  5  +  5 


-  -  5 

6 


3a2  +  3a1  +  9 


A* (a1 ) 

By  the  new  formula  (24) 


8a2  +  4a1  +  5 


1  +  6  +  9 
10  +  8  +  5 


=  -5=6 


SU3) 


A 1  ( a3  ) 


3*3- 

••a3  3  =  - - 


2a  +  1  9 

- .a  = 


5  +  1 


3a6  +  10a3  +  2 


5  +  3  +  2 


Y«  = 

._8 • 3  - 

2a8  +  1 

T  2 

A  *  (  a8  ) 

3a6  +  10a8 

+  2 

Y- = 

!(«9)  9.3  - 

2a9  +  1 

T  3 

A 1  ( a9  ) 

3a8  +  10a9 

+  2 

6  +  1 


5  +  8  +  2 


1  +  1 


9  +  5  +  2 
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Example  5:  BCH  (15,5)  3-error-correcting  code 


Let  a  be  a  root  of  f ( x )  =  x4  +  x  +  l  =  0  over  GF(2), 
Let  e(x)  =  x14  +  x12  +  x9 . 

Then  S(x)  =  a12x5  +  alux4  +  a9x3  +  a6X2  +  a12X  +  a6, 


M  x) 

A'(x) 

O(x) 


rt5X3  +  alux2  +  (t6X  +  1 

A  ( X ) 

=  X3  +  a6x2  +  a10x  + 

a5x2  +  cr6 

A '  ( x) 

=  x2  + 

a6x2  +  a6 

K(x) 

in?  a 

=  a  x  +  a  X  + 

A'(x) 

's  formula  (16) 

^(XT1 ) 

Yi  = - ~r-  s  L 

J  a'U*1) 


By  the  new  formula  (24) 

iLilL  « 

Y i  =  -  •<*' 

A‘(«9) 


5(a12) 

A'(a12) 


a 1 9  a 1 8  +  a9a9  +  a2 

- -a72 

a18  ♦  a10 


=  +  a!..g12  =  -°lL.a12 

a3  +  a10  a12 


1 0  24  9  12  ,  2 

i  a  +  a  a  +  a 


a24  +  tx10 


a4  +  n:6  +  a2  6  a7  6 

- - -  -  •  (Y  “  - -•  (7 

+  a10  a13 
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*U14) 

1028  .  9 

a  a  +  a  a 

= 

r 1 /  14  \ 

Ala  ) 

a28  +  a10 

a8  +  a8  +  a2  7  a2  7  , 

=  - »a  =  — -•<%  =  1 

a13  +  a10  a9 


The  algorithms  given  in  succeeding  sections  solve  the  key 
equation  (14)  for  a ( x )  and,  either  directly  or  indirectly,  for  q(x) 
and  A ( x )  as  well.  Decoding  is  then  completed  by  applying  a  Chien 

search  to  A ( x )  to  determine  the  error  locations  Xj  (j  =  1 . v) 

and  then  employing  either  (16)  or  (24)  to  obtain  the  error 
magnitudes  Yj  (j  =  1,  ....  v). 
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SECTION  4 

THE  JAPANESE  DECODING  ALGORITHM 

In  this  section  the  decoding  algorithm  of  Sugiyama,  Kasahara, 
Hirasawa,  and  Namekawa  [12],  which  we  call  the  Japanese  decoding 
algorithm,  is  reviewed.  It  is  known  that  the  key  equation  (14)  has 
a  solution  in  polynomials  Q(x)  and  A(x),  with 

deg(/v(x))  _<  t  and  deg(r?(x) )  _<  t  -  1 

(so  long  as  the  number  of  errors  v  <_  t).  The  strategy  of  Sugiyama, 
et  al.,  is  to  apply  Euclid's  algorithm  with  g(x)  =  S(x)  and  f(x)  = 
x2t.  Any  nonzero  scalar  multiple  of  x2t  will  do  just  as  well, 
since  the  term  ak(x)f(x)  in  the  Euclidean  relation 


rk(x)  =  ak(x)f(x)  +  bk ( x ) g ( x ) 


will  disappear  modulo  x2t.  The  choice  f(x)  =  -x2t  will  be 
adopted  in  this  section.  This  is  done  simply  for  convenience  so 
that  at  termination  ak(x)  becomes  the  polynomial  A(x)  of  (19). 
With  f(x)  =  -x2t  expression  (25)  becomes 

rk(x)  =  -ak(x)x2t  +  bk(x)S(x), 

where  rk(x)  denotes  the  polynomial  r^(x)  defined  in  the  k^ 
Iteration,  etc.  Reducing  modulo  x2t  yields 


rk(x)  =  jbk(x)S(x}|  2t< 
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Thus,  if  Euclid  s  algorithm  can  be  terminated  in  such  a  way  that  the 
degrees  of  rk(x)  and  bk(x)  at  termination  satisfy  the  conditions 
for  the  degrees  of  q(x)  and  Aix),  respectively,  then  it  provides  an 
effective  means  for  solving  the  key  equation. 

Recall  that,  in  program  3,  bk(x)  is  the  value  of  bN(x) 
defined  at  the  k*h  iteration;  it  follows  that 

.  k 

deg(bK(x))  =  l  deg(q1(x)). 


Also, 


qk(x)  =  rk_2(x)/rk_1(x) . 


Thus, 


deg( qk ( x ) )  =  deg(rk_2(x))  -  deg(rk-1(x)) 


deg(rk~l (x) )  =  deg(rk_2(x))  -  deg(qk(x)) 

=  deg(rk_3(x))  -  (deg(qk(x))  +  deg(qk_1 (x) ) ) 


=  deg(r_1(x))  -  l  deg(q^ ( x) ) 


=  deg(f(x))  -  deg(bk(x)). 
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If  Euclid  s  algorithm  is  terminated  at  the  least  value  of  k  >  0 
for  which  deg(rMx))  <  t,  then 


deg{rk_1(x))  >  t 


which  implies  that 


deg(bk(x))  =  deg( f (x ) )  -  deg(rk-1(x))  <  2t  -  t  =  t. 


Therefore,  rk(x)  and  bk { x )  satisfy  the  degree  conditions  for 
Q(x)  and  a(x),  respectively. 

Suppose,  however,  that  Q*(x)  and  A*(x)  are  another  pair  of 
solutions  to  (14),  with 


deg(A*(x))  <  deg(t>(x)). 


rk(x)A*{x)  =  -ak(x)x2tA*(x)  +  bk(x)S(x)A*(x) 
Q*(x)bk(x)  =  -A*(x)x2tbk(x)  +  A*(x)S(x)bk(x) 


| rk(xh*(x) |  2t  =  | Q*( x ) bK ( x ) |  2t< 
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However, 


deg(rk(x))  <  t,  deg(A*(x))  £  t  =>  deg(rk(x)A*(x))  <  2t 


deg(Q*(x) )  <  t,  deg(bk(x))  <  t  =>  deg(Q*(x)bk(x) )  <  2t. 


Therefore, 


rk(x)A*(x)  =  Q*(x)bk(x). 


Combining  (26)  and  (27), 


ak(x)A*(x)  =  A*(x)bk(x) 


ik(x)/A*(x)  =  bk(x)/A*(x)  =  h(x) , 


say,  with 


deg(h(x))  >  0. 
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a  (x)  =  h(x)Q*(x) 
bk(x)  =  h(x)A*(x) 

and  h(x)  is  a  common  divisor  of  ak(x)  and  bk(x).  But  by  (8), 
gcd(ak(x) ,bk(x) )  =  9,  a  scalar,  leading  to  a  contradiction. 
Therefore,  there  can  be  no  solutions  to  (14)  of  lower  degree  than 
rk(x)  and  bk(x).  Thus,  rk(x)  and  bk(x)  are  the  desired 
solutions  Q( x)  and  A(x),  except  for  possible  multiplication  by  a 
scale  factor  y. 

To  obtain  a0  =  1,  choose  y  =  b0  and  define 


Y~  1b*c  ( x ) 


q(x)  =  Y_1rk(x). 

The  polynomial  a(x)  need  not  be  computed  in  program  3  as  it  is  not 
used;  if  it  is  computed,  then  at  termination  ak(x)  is  a  scalar 
multiple  of  the  polynomial  A( x)  and  is  the  part  of  the  product 
bk(x)$(x)  excised  by  the  residue  taking  operation: 


ak(x)  =  L(bk(x)S(x))/x2tJ  =  yMx). 
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Program  4  can  be  used  to  solve  the  key  equation  for  a(x)  and 
q(x)  by  the  Japanese  algorithm.  The  termination  box  can  be  omitted; 
it  serves  no  useful  purpose.  Neither  the  Chien  search  nor  the 
calculation  of  the  error  magnitudes  by  Forney's  algorithm  is 
affected  by  multiplication  of  A ( x)  and  Q(x)  by  the  same  nonzero 
field  element. 

For  an  illustration  of  the  execution  of  program  4,  let  us  again 
call  on  the  3-error-correcting  Reed-Solomon  code  over  GF (11)  with 
a  =  2  that  was  used  in  examples  3  and  4. 

Example  6:  g(x)  =  (x  -  2 ) ( x  -  4 ) ( x  -  8 ) { x  -  5 ) ( x  -  10 ) ( x  -  9) 

=  x6  +  6x5  +  5x4  +  7x3  +  2x2  +  8x  +  2. 

Let  c(x)  =  0,  v{x)  =  e(x)  =  6x9  +  5x8  +  3x3. 

Then  S(x)  *  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 

Values  of  rN(x),  q(x),  and  bN(x)  are  shown  in  the  following 
table  for  successive  iterations  k  of  program  4: 


10x5+  7x4+9 
8x4+6 
9 


q(x)  -  lr°(x)/r~(K)j 
y(x)  -  r°(x)  -  q(x)rN(x) 
r°(x)  -  r*(x) 
r^(x)  -  y(x) 
y(x)  -  b°(x)  -  q(x)bN(x) 
b°(x)  -  bN(x) 
bN(x)  -  y(x) 

DEG^x))  :  t 


Therefore,  the  inverse  error  locations  are  given  by  {a1, a2, a7}  and 
the  error  locations  are  given  by  {a9, a8, a3}.  The  error  magnitudes 
are  calculated  by  either  Forney's  formula  (16)  or  the  new  formula 
(24)  as  In  example  4. 

To  assist  In  comparing  decoding  algorithms,  estimates  can  be 
made  of  the  number  of  scalar  multiplications  required  for  correcting 
t  errors.  For  program  4,  it  is  assumed  for  this  analysis  that 
deg(q(x))  is  always  1.  In  that  case  t  iterations  will  be  required. 
Each  Iteration  involves  two  multi  plications  of  r^(x)  by  a  scalar 
and  two  multiplications  of  bN(x)  by  a  scalar.  The  polynomial 
rN(x)  initially  has  degree  2t  -  1  (i.e.,  2t  terms),  and  at  the 
last  iteration  has  degree  t  (i.e.,  t  +  1  terms).  The  number  of 
multiplications  required  to  update  rN(x),  then,  is 

2t 

2  l  i  =  2t(t  +  1  +  2t)/2  =  3t2  +  t. 
i=t+l 

The  polynomial  bN(x)  initially  has  degree  0  (i.e.,  one  term),  and 
at  the  last  iteration  has  degree  t  -  1  (i.e.,  t  terms).  The  number 
of  multiplications  required  to  update  bN(x),  then,  is 

2t 

2  l  1  =  2 t { 1  +  t)/2  =  t2  +  t. 
i=l 

(Another  t2  -  t  would  be  required  if  a(x)  were  also  retained.) 
Program  4,  then,  needs  on  the  order  of  4t2  multiplications  for 
determining  a(x)  when  t  errors  occur. 
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In  addition  to  estimating  the  number  of  scalar  multiplications 
it  is  important  to  provide  some  estimate  of  the  number  of  basic  time 
units  required  by  the  algorithm,  corresponding,  roughly,  to  the 
number  of  levels  which  might  be  required  In  a  systolic  array 
Implementation.  For  this  analysis  it  is  again  assumed  that  t  errors 
are  to  be  corrected  and  that  deg(q(x))  is  always  1.  In  this  case,  t 
iterations  of  the  algorithm  are  required,  each  Involving  two 
multiplications  of  rN( x)  and  bN(x)  by  q(x)  =  qjx  +  q0.  The 
important  question  to  answer  is  whether  the  updates  of  rN(x)  and 
bN(x)  can  proceed  simultaneously.  At  the  start  of  an  iteration  qx 
is  immediately  available  as  the  ratio  of  the  leading  terms  of 
r°(x)  and  rN(x).  Thus,  both  the  update  of  rN( x)  and  bN(x) 
can  be  initiated  using  qx,  and  all  terms  of  each  polynomial  can  be 
multiplied  at  the  same  time  and  then  subtracted  from  the  appropriate 
corresponding  terms  of  r°(x)  and  b°(x).  At  completion,  q0  is 
then  available  for  completion  of  the  division  and  polynomial 
updates.  The  algorithm  thus  requires  2t  basic  time  units  (systolic 
array  levels),  where  a  basic  time  unit  includes  the  time  required 
for  one  finite  field  scalar  division,  one  scalar  multiplication,  and 
one  subtraction.  The  number  of  cells  required  at  each  level  is 


deg( rN( x ) )  +  1  +  deg(bN( x ) )  +  1  =  2t  +  1 


resulting  again  In  a  total  of  4t2  +  2t  multiplications.  This 
assumes  that  the  additional  cells  needed  for  bN(x)  as  It  grows  In 
length  can  be  supplied  from  the  cells  no  longer  needed  for  r^(x) 


as  it  shrinks. 


In  conclusion,  then,  the  Japanese  algorithm  requires  on  the 
order  of  4t2  multiplications  and  2t  basic  time  units  for  decoding  t 
errors  in  a  codeword  of  length  n. 
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SECTION  5 


MILLS’  CONTINUED  FRACTION  ALGORITHM 

Continued  fraction  approximations  are  closely  related  to 
Euclid's  algorithm.  In  [13]  Mills  developed  a  oontinued  fraction 
algorithm  for  solving  linear  recurrences.  This  algorithm  can  be 
applied  to  solving  the  key  equation  (14)  for  BCH  decoding. 

Following  the  results  of  Welch  and  Scholtz  [14],  we  show  that  the 
recursion  employed  in  Mills'  algorithm  is  the  Euclidean  recursion 
(1),  and  that,  in  the  decoding  context,  the  algorithm  is  essentially 
the  same  as  the  Japanese  decoding  algorithm  with  different  initiali¬ 
zation. 

This  section  is  divided  into  two  parts.  In  section  5.1  the 
relation  of  continued  fractions  to  Euclid's  algorithm  is  explored. 

In  section  5.2  the  decoding  of  BCH  codes  by  Mills'  algorithm  is 
discussed. 

5.1  CONTINUED  FRACTIONS  AND  EUCLID'S  ALGORITHM 

Mills  considers  a  continued  fraction  expansion  of  the  form 


a  =  yi  + 


where  yj,  y2,  y3,  .... 


are  elements  from  some  field.  We  can  write 


w 


and  so  forth.  (In  the  classical  case  a  Is  a  real  number,  the  yj 
are  Integers,  and  0  <  z\  <  1  for  all  1.)  If  ^  *  0  for  some  m, 
then  the  continued  fraction  terminates  with  ym.  Otherwise,  the 
continued  fraction  Is  Infinite.  The  kth  continued  fraction 
approximation  u^  to  a  is  defined  by  setting  *  0: 


K  =  yi 


1 


y2  + 


(30) 


Suppose  a  is  rational,  i.e.,  a  =  s/t  for  integers  s  and  t,  with 
s  >  t.  Then 


yi  =  Ls/t j 

Z\  -  s/t  -  I s/t  (  *  (s  -  f  s/t  (t)/t 


I.e.,  yi  is  Identical  to  qx  in  Euclid's  algorithm  and  zi  is  identi¬ 
cal  to  ri/t.  Continuing, 


y2  =  I  l/zj  =  q2 
z 2  =  1/zi  -  y2  =  r2/ri 


and  so  forth.  Taking  the  continued  fraction  expansion  of  a  rational 
number  s/t  Is  the  same  as  applying  Euclid’s  algorithm  to  find  the 
gcd(s,t).  Let  us  repeat  example  1  from  section  2  as  a  continued 
fraction  expansion  of  9022/4719. 


yi 

,-%j»  V  V 1 


Example  7 :  s  =  9022,  t  =  4719.  s/t  =  1.9118457300275... 


Iteration  1:  s/t  =  yi  + 

yL  =  [9022/4719  ]  =  1 

Zi  =  4303/4719  =  .911845730027... 


The  first  continued  fraction  approximation  to  s/t  is  obtained  by 
setting  Zj  equal  to  0: 

ui  =  yi  =  1. 


Iteration  2:  s/t  =  yi  +  l/(yo  +  z2) 

y2  =  [ 1/z ! J  =  [47 19/4303 J  =  1 
z2  =  416/4303  =  .096676737160... 


The  second  continued  fraction  approximation  to  s/t  is  obtained  by 
setting  z2  to  0: 

U2  =  yi  +  l/y2  =  1  +  1/1  =  2. 


w? 

m 


tsesf 


Iteration  3:  s/t  =  yx  +  - — x —  , 


y3  +  *3 

y3  *  ll/z2J  «  [4303/4161  =  10 

z3  =  143/416  =  .34375 

U3  =  +  72"i~r  =  1  +  TT1 

*3  TB 


1  +  yy  =  yy  =  1.9090... 
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Iteration  4:  y4  =  | 416/143 J  =  2 

z4  =  130/143  =  .9090... 

u4  =  44/23  =  1.913043478260869... 


Iteration  5:  y5  =  | 143/130 J  =  1 
z5  =  13/130  =  .1 

u5  =  65/34  =  1.91176470588235... 


Iteration  6:  y6  =  ) 130/13 j  =  10 
26  -  0 

u6  =  694/363  =  s/t  =  1.9118457300275... 


The  expansion  terminates  at  y6.  The  equivalence  with  Euclid's 
algorithm  shows  that  termination  must  always  occur  when  a  is 
rational,  for  the  rk  form  a  strictly  decreasing  sequence  of 
nonnegative  integers  which  must  eventually,  for  some  finite  m, 
satisfy  rm  =  0. 


It  may  also  be  observed  in  example  7  that  uk 
That  this  is  plausible  follows  from  equation  (4): 


■  -bk/ak. 


implies  that 


V  +  bkt  =  rk 


-bk/ak  =  s/t  -  rk/(tak) 


-ak/bk  =  t/s  -  rk/(sbk). 


Thus  the  error,  say  e^,  of  the  approximation  -b^/a^  to  s/t  is 
given  by 


rk/(tak). 


When  the  expansion  terminates  with  rn  =  0  for  some  n,  en  =  0  and 
the  final  approximation  -bn/an  is  exact.  Furthermore,  by  (5), 
gcoUfcjbk)  =  1  so  that  un  =  -bn/an  is  equal  to  s/t  reduced 
by  cancellation  of  all  common  factors,  i.e., 


| bn  j  =  s/gcd(s,t) 
|  an |  =  t/gcd(s,t) 


More  precisely,  a  comparison  of  examples  7  and  1  shows  that  the 
relationship  between  the  approximation  u|<  in  Mills'  algorithm  and 
the  quantities  a^  and  b^  in  the  extended  Euclid's  algorithm  can 
be  expressed  by 


(-i)kbk 

Uk  T-T7k+fa, 


e.g., 
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U4  =  13  =  ^au 


V  A.'i.N  /i  .»  .S  .<  A  A  _%  -S  /i  L“w  *■  A  A  A  **.  ,%  ,N  A  >,  •  A  .  «  •  •  .  »  -  - 


HR 


Following  Welch  and  Scholtz  [14],  we  now  show  that  adoption  of  the 
relation  (35)  leads  to  the  Euclidean  recursion  (3)  for  a^  and 
bfc.  Equation  (29)  can  be  written  as 


s 

I 


yi 


y2 


_ i_ 

+  yk  +  zk 


(36) 


By  a  process  of  rationalization,  (36)  can  always  be  manipulated  into 
the  form 


s  _  Ak(yk  +  zk)  +  Bk 
T  =  Ck(yk  +  V  +  °k 


(37) 


where  the  functions  A^,  B^,  C^,  and  involve  products  of 
the  yj  for  j  <  k.  Substituting  k  +  1  for  k  in  (37)  gives 


s  =  Ak+l(yk+l  *  zk+l*  *  Bk+1 
T  Ck+llyk+l  +  zk+l 1  +  Dk+1  ’ 


(38) 


On  the  other  hand,  replacing  z^  by  (y|(+i  +  Zk+1^”1  (37) 

yields,  after  rationalization. 


s  =  (Akyk  *  Bk)(yk+l  *  zk+l*  *  Ak 
I  (Ckyk  +  DkMyk+l  +  zk+l)  +  ck  * 


(39) 


The  kt(1  continued  fraction  approximation  to  s/t  is  obtained  by 
setting  z^  to  0  in  (37): 


=  Vic  4  Bk 

^  Vic  +  Dk 


Combining  (35)  and  (40)  we  set 


(-1}  bk s  Vk +  Bk 

<-l>k+1*k  ■  Vk +  Dk 


\ '  *k 


Ak+l=Akyk+Bk  (‘n  bk 


Bk+l=Ak  lbk_! 


■>  (-l)kbk  =  (-D*C'1Vlqk+(“1)l 


bk  =  bk-2  '  qkbk-l 


Ck+rCkyk+Dk=("1)  ak 
Dk+l=Ck  =('1)kak-l 


->  (-l)k+1ak  =  (-Dkak.1qk  +  M) 


ak-2  '  qkak-l 


Equations  (41)  and  (42)  are  the  Euclidean  recursions  of  equation 
(3).  Thus,  taking  a  continued  fraction  expansion  of  a  rational 
number  s/t  Is  Identical  to  finding  the  gcd(s,t)  by  Euclid's 
algorithm,  the  kth  continued  fraction  approximation  being  given  by 
(35).  Initial  conditions  are  obtained  as  follows: 


Ai(yi  +  Zi )  +  Bi 


=  yi  +  *i 


=>  Ax  =  1,  Bji  =  0,  CL  s  0,  Dj  8  1 

=>  bx  =  -q,  ax  =  1 

c  A2(y2  +  z2)  +  B2  y i (y2  +  z2)  +  1 


2W2  +  2.21  +  u2  y2  +  z2 
=>  A2  =  qi,  B2  c  1,  C2  »  1,  D2  -  0 
=>  b2  =  q:q2  +  1,  a2  =  -q2 
a2  =  a0  -  q2a^  =>  ao  =0 
ai  ~  a-i  -  qiao  =>  a- i  =  l 
b2  =  bg  -  q2bi  =>  bo  =1 
bj,  =  b_i  -  q i b o  =>  b. i  =  0 

These  are  the  same  initial  conditions  as  used  for  Euclid's 
algorithm. 


Next,  consider  the  case  where  a  of  equation  (29)  Is  a  ratio  of 
two  polynomials  over  a  finite  field:  a  =  f(x)/g(x).  In  this  case. 
Is  an  Infinite  power  series  in  x,  the  y-j  are  polynomials,  and  the 
zj  are  infinite  power  series  expansions  in  negative  powers  of  x. 


Let  degfe'k)  denote  the  exponent  of  the  first  nonzero  term  of 
e'k  in  (46).  Then,  since 


deg( rk ( x ) )  <  deg(rk-1(x) ) 


deg(bk(x))  >  deg(bk-1(x)) 


it  follows  that 


deg(e’k)  £  deg(e'((_1)  -  2  ( 

that  is,  the  approximations  (43)  and  (44)  match  at  least  two  more 
terms  of  the  series  f(x)/g(x)  or  the  series  g(x)/f(x)  at  each 
iteration. 

Corresponding  to  (35),  the  kth  continued  fraction  approxima¬ 
tion  to  f(x)/g(x)  is  given  by 


(-l)kbk(x) 


k  (-l)k+1ak(x) 


and  since  by  (8)  gcd(ak(x),bk(x))  =  y  for  some  field  element  y, 
(48)  is  the  approximation  to  f(x)/g(x)  with  polynomials  ak(x)  and 
bk(x)  of  lowest  degree  possible. 

Finally,  by  setting 


(-l)kbk(x)  =  A|cyk  +  Bk 


m 


I 

i 


^VYTV'Jr.nA  7&\?  VO A>  V'  ?  7. 


(-l)k+1ak(x)  -  Ckyk  t  Ok 


\(x)  =  *k 

In  (40),  one  obtains  the  Euclidean  recursion  for  polynomials: 
bk(x)  =  bk"2(x)  -  qk(x)bk_1(x) 

(45 

ak(x)  =  ak_2(x)  -  qk(x)ak_1(x) 

For  illustrative  purposes,  we  repeat  example  2  from  section  2  as  a 
continued  fraction  expansion  of  f(x)/g(x): 

Example  8:  f(x)  =  x5  +  3x4  +  3x2  +  5x  +  10 
g(x)  ■  2x2  +  7x  +  3 

over  GF (11). 


Iteration  1:  f(x)/g(x)  =  +  Zi 

yi  =  [f(x)/g(x)j  =  6x3  +  8x2  +  7x  +  9 
Zi  =  ( 9x  +  5)/(2x2  +  7x  +  3) 


The  first  continued  fraction  approximation  to  f(x)/g(x)  Is  obtained 
by  setting  zx  equal  to  0: 


Ui  =  yi  =  6x3  +  8x2  +  7x  +  9  *  -b1(x)/a1(x) 


Iteration  2:  f(x)/g(x)  =  yi  +  l/(y2  +  z2) 

y2  =  ll/ZiJ  =  lOx  +  5 
z2  =  0 


u2  =  yi  +  l/y2  =  (5x  +  4x  +  2)/( lOx  +  5) 
=  b2(x)/(-a2(x)) 


In  section  5.2  it  Is  shown  how  continued  fraction  expansions 
can  be  used  for  solving  the  key  equation. 

5.2  DECODING  BCH  CODES  BY  MILLS'  ALGORITHM 

In  section  5.1,  the  equivalence  of  a  continued  fraction  expan¬ 
sion  of  a  ratio  of  polynomials  to  Euclid's  algorithm  has  been  demon¬ 
strated.  In  this  section  our  purpose  is  to  show  how  a  continued 
fraction  expansion  can  be  employed  for  solving  the  key  equation 
(14).  This  Is  narrower  than  the  question  addressed  by  Mills  In 
[13],  which  is  to  find  a  continued  fraction  expansion  for  an 
infinite  power  series  defined  from  an  arbitrary  infinite  sequence 
s0,  Si,  ....  of  elements  from  some  field.  However,  the  expansion  we 
use  is  the  same  as  Mills',  and  we  shall  call  the  resulting  algo¬ 
rithm,  which  is  very  similar  to  the  Japanese  algorithm.  Mills' 
decoding  algorithm. 

In  section  3.2  it  was  seen  that  the  reversal  5(x)  of  the 
syndrome  polynomial  is  given  by  (22) 

5(x)  =  x2ta(x)  +  A(x) 

A(x) 

so  that 

5(x)  _  q(x)  +  S(x)/x2t 
x2t  a{ x) 

But,  if  we  set  f(x)  =  -x2t,  g(x)  =  5(x),  then  by  (44) 
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where  the  kth  approximation  to  $(x)/x2t  is  given  by 
ak(x)/bk(x) .  For  this  approximation  to  represent  a  solution  of 
the  key  equation,  with  o(x)  =  y_1a k(x),  A(x)  =  Y_1bk(x),  and 
S( x)  =  y_1rk(x),  it  is  required  that 


deg(rk(x))  <  t 

deg(ak(x))  <  t 


and 


deg(bk(x) )  <_  t. 

Let  K  be  the  first  index  k  for  which  deg(rk(x))  <  t.  Then, 
exactly  as  in  section  4, 


and 


deg( bK ( x) )  =  deg(x2t)  -  deg(rK_1(x)) 
<_  2t  -  t  =  t 

deg( aK ( x) )  <  deg{bK(x)} 


so  that  ak(x),  bk(x),  and  rK(x)  satisfy  the  degree 
requirements.  Furthermore,  by  (8), 


gcd{aK(x),bK(x))  =  y 
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for  some  field  element  y,  so  that  any  other  solution  to  (14)  is  some 
polynomial  multiple  of  aK(x),  bK(x),  and  r*(x).  Thus,  the 
polynomials  aK(x),  bK(x),  and  rK(x)  are  the  desired  Q(x),  a(x) 
and  X(x),  except  for  possible  multiplication  by  a  scalar. 

Program  5  is  a  representation  of  a  BCH  decoding  algorithm  based 
upon  Mills'  continued  fraction  algorithm.  The  only  significant 
differences  between  program  5  and  program  4  are  in  the  initializa¬ 
tion  of  r°(x)  and  r^(x)  and  in  the  termination.  In  accordance 

with  the  definition  of  5(x),  the  syndrome  values  defining  the 
polynomial  rO(x)  are  in  reverse  order  of  those  defining  r^(x)  in 
the  Japanese  algorithm.  At  termination,  the  program  furnishes 
scalar  multiples  of  the  reversed  polynomials  5(x)  and  A(x). 

For  illustration  we  use  the  same  example  employed  in  section  4, 
based  on  the  3-error-correcting  Reed- Solomon  code  generated  by 

g(x)  =  x6  +  6x5  +  5x4  +  7x3  +  2x2  +  8x  +  2 

over  GF (11). 

Example  9:  Let  c(x)  =  0,  v(x)  =  e ( x )  =  6x9  +  5x8  +  3x3. 


q(x) 

Li°(x)/rN(x)J 

y(*) 

-  r°(x)  - 

q(x)r"(x) 

r°(x) 

-  r"(x) 

r"(x) 

*-  y(x) 

y(x) 

-  8°(X)  - 

q(x)aN(x) 

a°(x) 

-  a"(x) 

aN(x) 

-  y(x) 

y(x) 

-  b°(x)  - 

q|x)bN(x) 

b°(x) 

-  bN(x) 

bN(x) 

-  y(*) 

DEG  (r**(x)) 

t 

Win 


Program  5  is  less  efficient  than  program  4  because  a(x)  must 
now  be  retained.  For  correcting  t  errors,  program  5  typically 
requires  3t2  +  t  multiplications  for  r(x),  t2  +  t  for  b(x),  and 
t2  -  t  for  a ( x )  for  a  total  of  the  order  of  5t2.  By  dropping 
unneeded  terms  of  r(x)  this  total  can  be  reduced  to  4t2.  This  is 
discussed  further  in  section  7.3. 


Timing  for  program  5  is  the  same  as  for  program  4,  assuming 
that  a(x)  can  be  updated  simultaneously  with  b(x).  Under  the 
assumption  of  t  errors  and  oeg(q(x))  =  i,  both  programs  require  2t 
basic  time  units  for  their  execution,  where  a  basic  time  unit 
represents  the  time  required  for  one  finite  field  scalar  division, 
one  scalar  multiplication,  and  one  scalar  subtraction. 


In  conclusion.  Mills'  algorithm  and  the  Japanese  decoding  algo 
rithm  are  seen  to  be  versions  of  Euclid's  algorithm  which  differ 
only  in  their  initial  conditions  and  termination  rules.  The 
Japanese  algorithm  gives  identical  results  to  Mills'  algorithm  if 
the  order  of  the  syndromes  is  reversed,  and  vice  versa.  We  are 
really  dealing  with  one  algorithm. 
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SECTION  6 


THE  BERLEKAMP-MASSEY  ALGORITHM 

In  this  section  the  Berlekamp-Massey  decoding  algorithm  is 
reviewed.  The  algorithm  was  originally  formulated  by  Berlekamp  [10] 
for  solving  the  key  equation  (14).  Massey  [11]  rederived  the 
algorithm  as  a  method  for  synthesizing  the  shortest-length  linear 
feedback  shift  register  which  will  generate  a  given  sequence.  We 
shall  follow  Massey's  development. 

Figure  1  is  an  illustration  of  a  linear  feedback  shift-register 


(LFSR)  consisting  of  L  stages, 
is  a  linear  combination. 


Each  input  Sj  to  the  first  stage 


s; =  -,1!  c^-i 

specified  by  the  feedback  polynomial  coefficients  Cj  (i  =  1, 

...,  L),  of  the  preceding  L  inputs  Sj_-j  (i  =  1 . L).  For  our 

purposes,  c-j  and  Si  are  elements  of  the  finite  field  GF(q)  for 
some  q,  a  prime  power.  The  output  from  the  shift  register  is  taken 

from  the  last  (Lth)  stage.  Thus,  the  first  L  outputs  . 

sl-l  are  identical  to  the  initial  contents  of  the  shift  register. 

An  LFSR  is  said  to  generate  a  finite  sequence  . . s^-i 

when  this  sequence  coincides  with  the  first  N  outputs  for  some 
initial  loading  of  the  LFSR.  If  L  >_  N,  the  LFSR  always  generates 
the  sequence,  independent  of  the  coefficients  c-j.  If  L  <  N,  the 
LFSR  generates  the  sequence  if  and  only  if 


S4  *  Vj-I 


=  0  (j  »  L,  L+l,  ....  N— 1 ) . 
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Equation  (51)  is  the  same  as  equation  (11)  relating  the 
syndromes  Sj  to  the  error  locator  polynomial  coefficients  A-,-. 

Thus,  for  known  A,  equation  (11)  is  the  equation  of  an  LFSR  which 
generates  the  syndromes.  We  want  to  find  a  a(x)  with  lowest  degree 
v  (i.e.,  fewest  errors  consistent  with  the  decoding  equations). 
Therefore,  we  seek  the  shortest  LFSR  that  generates  the  sequence  of 
syndromes . 

We  begin  with  the  statement  of  Massey's  Theorem  1  (for  a  proof, 
see  [11]). 

THEOREM:  If  an  LFSR  of  length  L  generates  the  sequence  . . 

sn_i,  but  not  the  sequence  s0,  ....  Sfj,  then  any  LFSR  that 
generates  the  latter  sequence  must  have  length  L'  satisfying 

L'  _>  N  +  1  -  L. 

Let  s  denote  an  infinite  sequence  s0,  . and  let  L^(s) 

denote  the  minimum  of  the  lengths  of  all  LFSR's  that  generate  the 
first  N  symbols  s0,  Sj,  ...,  Sfj_i  of  s.  Then  we  have  the 
following 

COROLLARY:  If  some  LFSR  of  length  L^(s)  generates  s0,  .... 

Sn-i,  but  not  s0,  ...,  sn,  then 

LN+1  ( $)  -  max(L|/s)»  N  +  1  -  LN( s ) ) .  (52) 

Massey's  strategy  Is  to  develop  an  LFSR  synthesis  algorithm  that 
satisfies  the  constraint  (52)  of  the  corollary  by  strict  equality. 


r>  \ 


For  a  given  sequence  s,  let 


Lm(s) 

c(N)(x)  =  1  +  J  c^'x1 
1-1  1 


denote  the  connection  polynomial  of  a  minimum-length  LN(s)  LFSR 
that  generates  s0»  As  an  inductive  hypothesis,  assume 

that  l_m(s)  and  c^)(x)  have  been  found  for  N  =  1,  2 . n  with 

equality  obtaining  in  (52)  for  N  =  1,  2,  ....  n  -  1.  We  seek 
Ln+i ( s )  and  c(n+1)(x)  with  equality  holding  in  (52)  for  the  case 
N  =  n.  From  (51)  we  have 


Ln(  s) 

I 

i=l 


c(n)s-  • 
ci  sj-i 


0  j  =  Ln(s) ,  ...»  n-1 

dn  j  =  n 


where  dn  is  the  discrepancy  between  sn  and  the  (n+l)st  symbol 

generated  by  the  LFSR  of  length  Ln( s )  which  generates  s0 . 

sn_i.  If  dn  =  0,  then  this  LFSR  also  generates  s0 . . 

and  Ln+i ( s)  =  Ln( s )  with  c(n+1)(x)  =  c^nUx).  If  dn  *  0, 

a  new  LFSR  must  be  found  to  generate  . sn.  We  want  to 

construct  this  new  LFSR,  with  connection  polynomial  c^n+1)(x),  to 
sati sfy 

L  .(s)  =  max[L  (s),  n  +  1  -  Ln(s)], 


Ln1,S)eS"«> 

i=l 


cr  ysj_i  =  o 


j  =  L  n+ 1 ( s ) ,  •  • • i  n. 


assey  cleverly  constructs  the  desired  LFSR  by  combining  the  latest 
FSR  with  the  LFSR  which  existed  at  the  time  of  the  last  length 
hange,  using  the  discrepancy,  say  d^,  produced  by  that  LFSR  to 
ancel  out  the  discrepancy  d„  produced  by  the  current  LFSR. 


Define  the  new  connection  polynomial  c(n+D(x)  by 


c"n(x)  .  c(nl(x)  -  <1  d-1x"-mc(m)(x). 

n  m 


Combining  (53),  (54)  and  (55)  yields 

Ln+l(s)  fn+ll 

sj  +  l  ci  sj-i 
i=l 

Ln<s)  ,  x  ,  Lm^s)  xx 

=  sj  +  1  ci  sj-i  “  ^n^m  [sj-n+m  +  1  ci  sj-n+m-il 

i=l  i=l 


0  for  j  =  l_n>  Ln+1,  ....  n-1  \  0  for  j  =  l-l_n+n . n-1 


dn  for  j  =  n 


dra  *  0  for  j  =  n 
m 


0  for  j  =  Ln+1(s),  Ln+1+1,  ....  n-1 


d  -  d  d"  d=  0  for  j  =  n 
n  n  m  m 


The  LFSR  of  length  Ln+i(s)  =  max(Ln(s),  n+l-Ln(s))  generates 

the  n+1  sequence  symbols  s0 . sn  and  satisfies  the  constraint 

(52)  with  strict  equality. 

Figure  2,  adapted  from  Blahut  [17]  figure  7.3,  illustrates  the 
new  shift  register  construction,  n  -  m  new  dunmy  stages  are  added 
to  the  front  end  of  the  old  shift  register,  whose  connection 
polynomial  is  c(m)(x),  in  order  to  line  up  Its  discrepancy  d^  to 
coincide  with  (and  cancel)  the  discrepancy  dn  produced  at  j  =  n  by 
the  current  shift  register,  with  connection  polynomial  c^n^(x). 


When  combined,  the  new  resulting  shift  register  will  have  length 
determined  by  the  maximum  of  the  length  Lm(s)  of  the  old  shift 
register  augmented  by  the  number  n  -  m  of  new  stages,  and  the  length 
Ln(s)  of  the  current  shift  register.  As  can  be  seen  directly  from 
figure  2,  the  input  Sj  to  the  combined  shift  register  is 
given  by 

Ln(s)  i  Lm(s)  ,  v 

sj  =  "  l  ci  sj-i  _  dpdm  (-sj-n+m  "  \  ci  sj-n+m-il* 

i=l  i=l 

Equation  (56)  provides  a  constructive  proof  of  Massey's  Theorem  2: 

THEOREM:  If  Lfj(s)  denotes  the  length  of  the  shortest  LFSR  which 
generates  s0,  . s^.i,  then 

(a)  if  some  LFSR  of  length  L^U)  which  generates  s0, 

Sn_i  also  generates  s0,  ...,  Sjj,  then 

Ln+i(s)  =  Ln(s); 

(b)  if  some  LFSR  of  length  Lfj(s)  which  generates  s0,  ..., 

SN-1  fails  to  generate  s0,  ...»  s^,  then 

L(j+i(s)  =  max(Lfc|(s),  N  +  1  -  L^(s)). 

We  observe  that  in  case  (b) 

if  L  ( s )  <  (n+l)/2,  then  L_ . .  ( s>  =  n  +  1  -  Ln(s); 


if  Ln ( s)  _>  (n+l)/2,  then  Lp+1 ( s )  *  Ln(s). 
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Program  6  is  a  representation  of  Massey's  version  of  the 
Berlekamp-Massey  algorithm.  Inputs  to  the  program  are  the  syndrome 
polynomial  S(x)  and  the  BCH  code  error-correction  capability  t.  The 
latter  is  used  at  line  4  of  the  recursion  in  the  test  for 
termination;  the  former  in  line  5  for  calculating  the  discrepancy 
between  the  jth  syndrome  and  the  j**1  symbol  output  by  the 
current  shift  register  with  connection  polynomial  bN(x). 

The  connection  polynomial  c(m)(x)  which  defines  the 
shift-register  before  the  last  length  change  is  represented  in 
normalized  form  by  bO(x).  This  polynomial  is  defined  at  line  9  of 
the  recursion  by  premultiplying  the  current  bN(x)  by  the  inverse 
of  the  nonzero  discrepancy  d.  (In  later  sections  we  shall  consider 
versions  of  the  algorithm  with  this  normalization  left  out.)  The 
polynomial  b°(x)  is  then  updated  in  each  iteration  at  line  1  of 
the  recursion  by  shifting  once,  corresponding  to  the  creation  of  a 
new  dummy  first  stage,  and  is  then  ready  for  use  in  defining  the  new 
shift  register  connection  polynomial  in  line  7  of  the  recursion. 

The  program  path  flow  is  slightly  more  complicated  than  for  the 
Euclidean  programs  4  and  5  as  a  result  of  the  length  test  and 
branching  at  line  8  and  the  discrepancy  test  and  branching  at  6. 
However,  both  tests  are.  In  a  sense,  implicit  In  the  Euclidean 
programs,  as  will  be  seen  in  section  7.3. 

At  termination,  the  error  locator  polynomial  a ( x)  Is  given  by 
bN(x).  The  error  evaluator  polynomial  n(x)  Is  not  Immediately 
available  In  Massey's  version,  and  must  be  calculated  by  the  key 
equation  (14).  However,  as  we  shall  see  In  the  next  section,  by  a 
slight  modification  of  program  6  we  can  also  obtain  q(x)  as  in  the 
Euclidean  programs. 
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f  -  0 

j  -  0 
b°(x)  -  1 
y(x)  -  1 


INITIALIZATION 


b°(x)  -  xb°(x) 
bN(x)  -  y(x) 
j  -  j  +  1 
j  :  2t 

i' 

d  -  £WSM 

i  =  0 

d  :  0 

y(x)  -  bN(x)  -  db°(x) 
j  :  2( 

b8(x)  -  d_1  bN(x) 

f  -  i  -  r 

RECURSION 


INPUT:  SYNDROME  POLYNOMIAL  S(x),  INTEGER  t 
OUTPUT:  a(x)  =  bN(x) 


♦►EXIT 


Program  6.  BERLEKAMP-MASSEY  ALGORITHM 


This  is  an  efficient  algorithm  in  terms  of  the  number  of 
multiplications  required.  Assuming  that  i  increases  by  1  at  every 
odd  numbered  iteration,  we  require  1  multiplication  to  compute  d  at 
the  first  iteration,  2  multiplications  at  the  second  and  third,  3  at 
the  fourth  and  fifth,  etc.,  t  -  1  at  the  2t  -  2n<*  and  2t  -  1st, 
and  t  at  the  2t^  iteration,  for  a  total  of  t2  multiplications  for 
computing  the  discrepancies.  Assuming  d  >  0  at  all  iterations,  we 
need  t2  +  t  multiplications  to  update  bN ( x )  at  line  7  of  the 
recursion  and  (t2  +  t)/2  to  update  bO(x)  at  line  9,  for  a  total  of 
the  order  of  2.5t2.  This  can  be  reduced  to  2t2  by  avoiding  the 
normalization  by  d-1  in  line  7,  as  will  be  discussed  in  section 
7.2. 

Program  6  has  the  drawback,  however,  that  the  updating  of  b(x) 
is  held  up  until  the  calculation  of  d  has  been  completed.  This 
drawback  is  removed  in  the  programs  considered  in  section  7,  but  at 
the  expense  of  requiring  further  multiplications.  Unlike  programs  4 
and  5,  program  6  cannot  be  executed  in  2t  basic  time  units,  and  is 
not  a  candidate  for  implementation  in  a  two-dimensional  systolic 
array.  At  each  iteration,  a  varying  additional  computation  time  is 
required  to  obtain  the  discrepancy  d. 

For  illustration  of  program  6,  we  use  the  same  example  of  a 
Reed-Sol omon  3-error-correcting  code  over  GF(ll)  as  used  previously 
for  programs  4  and  5. 

Example  10:  t  =  3;  Let  c(x)  =  0,  v(x)  =  e(x)  =  6x9  +  5x8  +  3x3. 

S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 
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SECTION  7 

HYBRIDS  AND  COMPARISONS 

In  this  section,  which  contains  the  main  results  of  this 
report,  comparisons  are  made  between  the  Berlekamp-Massey  algorithm 
and  Euclid's  algorithm  (in  the  Mills  version).  Hybrid  programs  are 
developed  which  combine  features  of  both  algorithms  to  advantage. 

The  section  is  divided  into  five  parts.  First,  the  Berlekamp-Massey 
algorithm  is  expanded  in  a  Euclidean  context.  Second,  a  new 
algorithm  developed  by  Todd  Citron  [22]  is  examined  and  shown  to 
belong  to  this  same  class.  Third,  Euclid's  algorithm  is  modified  to 
replace  polynomial  division  by  a  sequence  of  partial  divisions. 
Fourth,  Mills'  algorithm  is  modified  to  make  it  more  closely 
resemble  the  Berlekamp-Massey  algorithm.  Fifth,  comparisons  are 
made  among  the  resulting  hybrid  algorithms,  which  are  then  seen  to 
be  very  similar.  This  similarity  has  been  noted  previously  by  Welch 
and  Scholtz  [14],  as  well  as  others. 

7.1  THE  BERLEKAMP-MASSEY  ALGORITHM  IN  EUCLIDEAN  DRESS 

Our  objective  in  this  section  is  to  expand  the  Berlekamp-Massey 
algorithm  in  a  Euclidean  context.  Welch  and  Scholtz  [14]  have  noted 
a  correspondence  between  partial  results  obtained  for  bN ( x )  at 
certain  iterations  of  the  Berlekamp-Massey  algorithm  (program  6)  and 
partial  results  obtained  for  bN(x)  at  successive  iterations  of 
Mills  algorithm  (program  5).  To  explore  this  relationship  further, 
we  introduce  polynomials  a(x)  and  r(x)  for  the  Berlekamp-Massey 
algorithm  analogous  to  the  polynomials  a(x)  and  r(x)  of  Mills' 
algorithm. 
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Let  f(x)  and  g(x)  be  given  polynomials.  If  initial  values  for 
a®(x),  bfyx),  r°(x),  aN(x),  bN ( x ) ,  and  r^(x)  are  chosen 
to  satisfy 

r 0{ x)  =  aO(x)f(x)  +  bO(x)g(x)  (61 

rN(x)  =  aN(x)f(x)  +  bN(x)g(x)  (62 

then,  by  induction,  (61)  and  (62)  will  continue  to  be  satisfied  at 
all  iterations.  Let  ak(x),  bk(x),  and  rk{x)  denote  the  values 
of  a*J(x),  b^(x),  ana  r^fx)  defined  at  the  k^  iteration  of 
the  algor i thm.  Then  for  all  k 

rk(x)  =  a  k ( x ) f ( x )  +  bk(x)g(x)  (63 

for  the  Berl ekarnp-Massey  algorithm.  This  is  the  same  relationship 
that  holds  for  Euclid's  algorithm  (25),  even  though  the  recursions 
differ.  For  f(x)  and  g(x)  in  (63)  we  shall  choose 

f ( x )  =  -1 


g(x)  =  xS(x) 


converting  (63)  to 


rk(x)  =  -ak(x)  +  bk(x)xS(x),  k  =  1,  ...,  2t. 


Various  initializations  will  work  to  produce  the  result  (64). 

To  satisfy  (61)  we  set  rO(x)  =  1,  a^ix)  =  -1,  ana  t>0(x)  =  0. 

To  satisfy  (62)  we  choose  rw ( x )  =  xS(x),  a^(x)  =  0,  bN(x)  =  1. 

In  program  6,  t$(x)  and  bN(x)  are  updated  by  (59)  and 
(57).  It  follows  that  at  the  beginning  of  iteration  j,  with  a  shift 
register  of  length  t, 

l  <  j  =>  b^  =  0  for  all  i  >  St.  (65) 

In  a  similar  manner,  the  polynomials  aO(x)  and  aN ( x )  are 
updated  by  (58)  and  (60).  Again,  taking  into  consideration  the 
initial  values,  we  have  at  the  beginning  of  iteration  j 

£  <  j  =>  a^  =  0  for  all  i  >  t.  (66) 

At  the  kth  iteration,  let  hMx)  =  bMx)xS(x).  Then  since 

2t  2t 

xS(x)  =  J  SjxJ  =  l  SjxJ 
j=i  j=o 


if  Sg  is  defined  as  0, 


k  k 

l  bisk- 
i=0 


=  dk 
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and  by  (64) 


k  k  k  k 

rk  •  -ak  +  hk  -  0  +  < 


Therefore,  if  we  keep  r(x),  we  eliminate  the  inner-product 
computation  of  the  discrepancy  d. 


Program  7  is  a  representation  of  the  Berlekamp-Massey  algorithm 
in  a  Eucl ideanized  version.  We,  perhaps  wastefully,  introduce  three 
temporary  polynomials,  denoted  by  aJ(x),  bT(x),  and  rT(x),  for 
temporarily  holding  tne  new  versions  of  a(x),  b(x),  and  r(x)  created 


at  lines  11-13  of  tne  recursion.  Observe  that  at  each  iteration  j, 
rj  =  a.  At  line  17,  rj  is  set  to  1;  if  the  normalization  by  d-1 


were  not  performed,  its  value  would  be  d.  After  the  next 


incrementation  of  j,  r-j  is  still  equal  to  1;  if  normalization  were 
not  performed  at  line  17  it  would  still  have  the  value  d  =  dn.  At 
line  13,  r]  is  set  to  0.  At  termination,  /v(x)  is  given  by  bN(x) 

_ X  M  .  r-  j  _  i  o  ox  r _ x .* /  i  \ X  /  c  a  \  1,. 


and  ri  =  0  for  i  =  1,  2,  ...,  2t.  Equations  (14)  and  (64)  jointly 
imply  that 


o(x)  =  |(aN(x)  +  rN(x) )/x| x2t 


so  that  Q( x )  =  aN(x)/x  and  rN(x)/x  =  A(x)x2t.  Thus  the 
expanded  version  of  the  algorithm  provides  both  the  polynomials  o(x) 
and  Mx)  in  addition  to  the  error  locator  polynomial  \(x)  obtained 
in  Massey's  version. 


To  illustrate  program  7  we  again  use  the  Reed-Solomon 
3-error-correcting  code  over  GF ( 11 )  which  was  used  to  illustrate 


m 


as 


programs  4-6.  The  polynomials  rN(x),  r°(x),  etc.,  shown  are 
those  defined  prior  to  the  next  Incrementation  of  the  Index  j,  l.e 
after  execution  of  lines  1-6. 

Example  11:  t  =  3.  Let  c(x)  =  0,  v(x)  =  e{x)  *  6x9  +  5xe  +  3x3. 
S(x)  *  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 

j  -  0,  l  =  0 

rN(x)  =  xS(x)  =  10x6  +  7x5  +  9x4  +  8x3  +  2x2  +  9x 

r°(x)  =  x 

bN(x)  =  1 

b°(x)  =  0 

aN(x)  =  0 

a°(x)  =  -x  =  lOx 

j  =  1,  i  =  0,  d  *  r?  *  9,  d-1  =  5 

rN(x)  =  10x6  +  7xb  +  9x4  +  8x3  +  2x2 

r°(x)  =  6x7  +2x6  +  x5  +  7x4  +  10x3  +  x2 

bN(x)  =  1 

b°(x)  =  5x 

aN(x)  =  9x 

aO(x)  =  0 

J-2,  x-1.  d-rS-2 

rN(x)  *  10x7  +  6x6  +  5x5  +  6x4  +  10x3 

r°(x)  *  6x8  +  2x7  +  x6  +  7x5  +  10x4  +  x3 

b^(x)  *  x  +  1 

b°{x)  »  5x2 

aN(x)  =  9x 

a°(x)  *  0 


n(x)  *  aN(x)/x  =  3x2  +  3x  +  9 
X2tA(x)  =  rN(x)/x  =  x8  +  2x7 
A(x)  =  x2  +  2x 

The  price  paid  for  retaining  the  polynomials  a(x)  and  r(x)  in 
addition  to  b(x)  is  additional  multiplications.  If  all  coefficients 
of  rN(x)  and  r°(x)  are  retained,  then  to  update  rT(x)  at  the 
first  Iteration  takes  one  multiplication,  and  at  each  subsequent 
iteration  j  takes  2t  +  1  -  lj/2 J  for  a  total  of  3t2.  To  update 
r°(x)  requires  2t  -  |j/2j  multiplications  at  each  odd  iteration  j 
(assuming  i  is  incremented  at  each  odd  iteration)  for  a  total  of 
(3t2  +  t)/2  multiplications.  However  it  is  not  necessary  to  retain 
coefficients  above  those  for  x2t.  Dropping  these,  2t2  -  t  +  1 
multiplications  are  required  to  update  rT(x),  and  t2  +  t  to  update 
r°(x).  In  addition,  t2  +  t  multiplications  are  required  to  update 
bT(x),  t2  -  t  for  aT(x),  (t2  +  t)/2  for  bO(x),  and  (t2  -  t)/2 
for  a°(x),  giving  a  total  on  the  order  of  6t2. 

The  updates  of  all  polynomials  can  now  be  performed  in 
parallel,  however,  so  that  processing  time  Is  now  2t  basic  time 
units,  as  for  programs  4  and  5,  where  a  basic  time  unit  represents 
the  time  required  for  one  finite  field  multiplication  and  one 
subtraction  plus  (half  the  time)  one  finite  field  inversion. 
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7.2  CITRON’S  ALGORITHM 


Todd  Citron  has  presented  a  new  algorithm  [22],  based  on  Kung's 
generalized  Lanczos  recursion  [23,24]  and  related  to  Schur's 
algorithm  [25,26],  for  solving  the  key  equation  (14).  In  this 
section  we  shall  not  attempt  to  reproduce  Citron's  derivation,  but 
shall  simply  state  Citron's  algorithm  using  his  notation  and  show 
that  it  belongs  to  the  class  of  Euclideanlzed  Berlekamp-Massey 
algorithms  developed  in  section  7.1. 


Citron  describes  his  algorithm  using  matrix  notation  Instead  of 
employing  polynomials  over  finite  fields.  The  elements  of  his 
matrices  correspond  to  the  coefficients  of  our  polynomials,  as  will 
be  shown.  Citron  retains  three  two-columned  arrays:  R(1), 
corresponding  to  r(x),  P(1),  corresponding  to  b(x),  and  T(i), 
corresponding  to  a(x).  Citron's  algorithm  is  stated  as  follows: 


Recursion:  At  ith  iteration 


z_1R2( 1 ) 


Ri(1) 


z-1  0 


0  1 


Yi  ( 1-Y1 ) 


z-1R2(i-l) 


Rid-1) 


z-lP2(1) 


P  i  ( 1 ) 


z-1  0 


0  1 


Yi  (1- Y1 ) 


Z-1P2(1-1) 


P 1  ( 1-1 ) 


ivt2<i>i  pz-1  oi  n  -Pi  i  rz-iT2(i-i)i 


* 


WmS 


W*vN 


I 


*  */vr* 

-r. 


I 

m 
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where  z”1  denotes  a  left-shift  operation  (one  place). 

Pi  *  "top  element"  of  z^d  -  l)/"top  element"  of  Rx(1  -  1),  (71) 

Ni  =  «1-l  +  1 


Ni  =  j 

-Ni  if  Ni  >  0  and  Pi  *  0 

Ni  otherwise 

1  if  Ni  >  0  and  0 

and  Yi  = 

0  otherwise 

(72) 

Initialization: 

So  *  0 

Z-1R2(0)  = 

[Si  S2  ...  S2t]T 

(73) 

Ri(0)  = 

[1  0  ...  o]T 

z~1P2(0)  = 

i— 

i — i 
» — . 

o 

i 

• 

• 

• 

o 

o 

1 _ I 

(74) 

Pi(0)  = 

[0  ...  0]T 

z-xT2(0)  = 

[0  ...  0]T 

(75) 

TitO)  * 

[0  ...  0  -1]T 

At  termination: 

.-1 


z-AP2(2t)  *  U0  Ai  ...  A2t]T 
Z_1T2( 2t)  *  [Qg  Qj  ...  Q2tlT 


(76) 
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Let  us  first  apply  Citron  s  algorithm  to  our  Reed-Sol omon 
3-error-correcting  code  (over  GF(ll))  example.  Then  we  shall  show 
that  Citron's  algorithm  Is  a  slight  modification  of  the 
Euclldeanlzed  Berlekamp-Massey  algorithm  of  program  7.  For  the 
example,  we  display  Citron's  columns  as  rows.  By  (73)  we  need  2t 
places  for  R(1),  and  starting  from  (74)  and  (75)  we  need  2t  +  1 
places  for  P(1)  and  T(1),  since  2t  left  shifts  will  be  made  by  the 
algorithm  employing  (69)  and  (70). 

Example  12:  t  =  3;  c(x)  =  0,  v(x)  =  e(x)  =  6x9  +  5x8  +  3x3. 

S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 

2-1R2(0)  =  (9,  2,  8,  9,  7,  10) 

Rx (0)  =  (1,  0,  0,  0,  0,  0) 

1=1:  Pl  =  9/1  =  9 

NX  =  1 

»1  =  -1 

Yl  =  1 

Z-1R2(1)  =  (2,  8,  9,  7,  10,  0) 

R 1 ( 1 )  *  (9,  2,  8,  9,  7,  10) 

z-lP2(l)  =  (0,  0,  0,  0,  0,  1,  0) 

Pi(l)  =  (0,  0,  0,  0,  0,  0,  1) 

Z-1T2(1)  =  (0,  0,  0,  0,  0,  9,  0) 

Ti(l)  =  (0,  0,  0,  0,  0,  0,  0) 

1=2:  p2  =  2/9  =  10 

N2  =  0 

R2  ■  0 

Y2  b  0 
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z’1R2(2) 

M2) 

z-1P2(2) 


=  (10,  6,  5,  6,  10,  0) 
B  (9,  2,  8,  9,  7,  10) 


Pj(2)  = 

z-iT2(2)  = 
T  i  ( 2 )  = 


=  (0,  0,  0,  0,  1,  1,  0) 

=  (0,  0,  0,  0,  0,  0,  1) 


(o,  0,  0,  0,  9,  0,  0) 
(o,  0,  0,  0,  0,  0,  0) 


1  *  3: 


10/9  *  6 
1 


z-%(3)  = 
M3)  = 

z~1P2(3)  = 
P l ( 3)  = 

z-^sO)  * 
Tt(3)  * 


(5,  1,  7,  1,  6,  0) 
(10,  6,  5,  6,  10,  0) 

(0,  0,  0,  1,  1,  5,  0) 

(0,  0,  0,  0,  1,  1,  0) 

(0,  0,  0,  9,  0,  0,  0) 

(0,  0,  0,  0,  9,  0,  0) 

5/10  =  6 
0 


Z“1R2(4)  = 
Ri(4)  - 

z-iP2(4)  « 
Pi(4)  = 


Z"iT2(4)  = 
Ti(4)  * 


(9,  10,  9,  1,  0,  0) 
(10,  6,  5,  6,  10,  0) 

(0,  0,  1,  6,  10,  0,  0) 

(0,  0,  0,  0,  1,  1,  0) 

(0,  0,  9,  1,  0,  0,  0) 

(0,  0,  0,  0,  9,  0,  0) 


S'/'M 

i'siW. 1 
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5: 


P5  =  9/10  =  2 

N5  =  1 

K  =  -1 

Ys  =  1 

z-xR2(5)  =  (9,  10,  0,  2,  0,  0) 

R ! ( 5 )  =  (9,  10,  9,  1,  0,  0) 

Z-1P2(5)  =  (0,  1,  6,  8,  9,  0,  0) 

P 1 ( 5 )  =  (0,  0,  1,  6,  10,  0,  0) 

=  (0,  9,  1,  4,  0,  0,  0) 

T ! ( 5 )  =  (0,  0,  9,  1,  0,  0,  0) 

6:  p6  =9/9=1 

N6  =  0 

fi6  *  0 

Y6  »  0 

z-1R2(6)  =  (0,  2,  1,  0,  0,  0) 

Rj. ( 6 )  =  (9,  10,  9,  1,  0,  0) 

z-1P2(6)  =  (1,  5,  2,  10,  0,  0,  0)  =  {Ag,  ...»  Agt) 

Pl(6)  =  (0,  0,  1,  6,  10,  0,  0) 

Z’aT2{6)  =  (9,  3,  3,  0,  0,  0,  0)  =  (Q0 Q2t> 

Tl. ( 6 )  =  (0,  0,  9,  1,  0,  0,  0) 
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A  comparison  with  example  11  shows  that  the  components  of 
z_1R2(i),  z~ 1 P 2 ( i ) »  and  z-1T2(i),  are  the  same  as  the  coefficients 
of  rN(x),  bN(x),  and  a^x)  obtained  at  the  ith  iteration  of 
program  7. 

We  now  assert,  and  later  shall  show,  that  at  each  iteration  i. 
Citron's  o-j  *  0  if  and  only  if  Massey's  ith  discrepancy  d-j  t 
0.  With  this  assertion  we  show  that  Citron's  variable  Nf  = 
i  -  2?j_i,  where  ?.j_i  is  the  length  of  the  shift-register  at  the 
beginning  of  the  ith  iteration  of  the  Berlekamp-Massey  algorithm. 


Lemma : 

Ni (Citron)  =  i  -  2<’i_i 

(Massey) 

Proof: 

by  induction  on  i 

A) 

Let  i  =  1 

Citron: 

Massey 

N0  =  0 

>*> 

O 

tl 

O 

Ni  *  N0  +  1  =  1 

1  -  2t 

Therefore,  for  i  =  1,  N-j  =  i  -  2 Jl ^ _ i 
b)  Assume  lemma  is  true  for  i  =  n  -  1: 

Nn_i  =  n  -  1  -  2?n_2 

Case  1:  Nn_i  >  0  and  on_i  *  0  =>  25n_2  <  n-1  and  dn_i  *  q 


TT 
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Citron:  Massey:  length  change  required 

Rn-1  *  -Nn-1  *  -n  +  1  +  2?n-2  <n-l  =  n  -  1-  *n_2 

Kln  =  Kln_ i  +  1  =  -n  +  2  +  2in_2  n  -  2;.n_i  =  n  -  2(n-l-£n_2) 

=  -n  +  2  +  2An_2 

Case  2:  Nn_i  <  0  or  pn_i  =  0  =>  2£n_2  _>  n-1  or  dn_i  =  0 

Citron  Massey:  no  length  change 

Nn-1  =  Nn-1  ^n-1  =  *n-2 

Nn  =  fin-l+l  =  Nn-1+1  b-2*n-l  =  n-2Jln_2  =  Nn_!+1 

Therefore,  truth  of  the  lemma  for  i  =  n  -  1  implies  truth  for 
i  =  n. 

Therefore,  by  induction  on  i,  the  lemma  must  be  true  for  all  i. 

This  result  implies  that  for  all  iterations  i  Citron's 
definition  of  yj  is  equivalent  to 

1  if  o.  *  0  and  Za  <  i 

v1  =  (77) 

.  0  otherwi se 

where  t  is  Massey's  shift-register  length  (not  explicitly  computed 
in  Citron's  algorithm),  yj  is  a  logical  variable  employed  by 
Citron  to  eliminate  branching.  This  may  be  important  for  VLSI 
implementation  of  the  algorithm,  but  to  some  extent  hides  what  is 
going  on.  For  purposes  of  comparing  algorithms  we  leave  the 
branching  explicit  in  our  programs. 


The  first  matrix  on  the  rhs  of  (68)  produces 
[z_1R2(i ) , Ri ( i )]T  on  the  lhs  instead  of  [R2(i ),Ri(i)]T. 

Removing  this  matrix  multiplication  to  get  at  the  recursion  proper, 
we  have 

R2 ( i )  =  z-1R2(i-l)  -  PiRi(i-l)  (78) 

and 

z“ 1 R2 ( i -1 )  if  oj  *  0  and  2?  <  i 
Ri  (i )  =  (79) 

Ri(i-l)  otherwise 

Citron  shifts  the  arrays  R2(i),  P2 ( i ) ,  and  T2(i)  at  each 
iteration  one  place  to  the  left,  but  does  not  shift  Rr(i),  Pj(i), 
and  Ti(i).  On  the  other  hand,  in  program  7,  the  polynomial 
coefficients  for  r°(x),  b°(x),  and  a°(x)  are  shifted  once  to 
the  right  at  each  iteration  (lines  1-3  of  the  recursion)  while 
rN(x),  bN(x),  and  aN( x )  are  not  shifted.  Thus,  the  components 
of  R2(i),  etc.,  bear  the  same  relation  to  the  components  of  Ri(i), 
etc.,  as  the  coefficients  of  rN(x),  etc.,  to  the  coefficients  of 
r°(x),  etc.  (Note,  however,  that  at  the  ith  iteration,  the 
component  of  R2(i)  which  corresponds  to  r!jJ  is  now  at  the  "top"  of 
the  array.) 

In  order  to  compare  Citron's  algorithm  with  program  7  we  should 
like  to  remove  his  left  shift  of  R2(i)  and  insert  in  its  stead  a 
right  shift  of  Rj(i).  First,  let  us  multiply  relations  (78)  and 
(79)  by  the  right  shift  operator  z,  yielding 


zR2( i )  *  R2(i-1)  -  Pi2Ri{i-l) 


m 


zRi(i)  = 


R2 ( 1  - 1 )  if  Pi  *  0  and  Zl  <  i 


zRjd-l)  otherwise 


Relations  (80)  and  (81)  are  still  the  fundamental  equations  defining 
R(i)  in  Citron's  algorithm.  The  z  preceding  R2(i)  on  the  lhs  of 
equation  (80)  effects  the  left  shift  of  the  vector.  To  remove  the 
left  shift,  we  simply  remove  this  z,  producing  the  relation 

R2 ( i )  =  R2( i-1)  -  Pi  z  R ! ( i - 1 )  (82) 


To  induce  a  right  shift  of  the  vector  R^i)  we  remove  the  z 
preceding  R^i)  on  the  lhs  of  equation  (81),  yielding 


Rid)  = 


R2 ( i-1 )  if  pi  *  0  and  Zl  <  i 


zRi(i-l)  otherwise 


Equations  (82)  and  (83),  together  with  a  corresponding  set  of 
modifications  to  (69)  and  (70),  define  Citron's  algorithm  with  the 
shifts  changed  to  correspond  to  the  shift  in  the  Berlekamp-Massey 
algorithm. 

But  equations  (82)  and  (83)  are  essentially  the  same  equations 
as  those  found  in  the  Berlekamp-Massey  algorithm  of  program  7,  with 
R2(1)  corresponding  to  (and  Identical  to)  r^(x)  and  Rx(  1 ) 
corresponding  to  (but  not  Identical  to)  rO(x).  When  a  length 
change  is  required,  Citron  does  not  define  Rx ( 1 )  as  d^1 R2( i-1 ) , 
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as  on  line  17  of  program  7,  but  simply  as  R2(i-1).  Therefore,  when 
R2  is  updated,  it  must  be  updated  not  as  R2(i-1)  -  d^R^i-l),  but 


as  R2 ( i - 1 )  -  dpd-^zRx ( i -1 ) .  Thus,  in  Citron's  algorithm  is 


the  ratio  of  the  current  dn,  given  by  the  top  element  of  R2{i-1), 
divided  by  the  ol d  d,^,  given  by  the  top  element  of  Rx ( i - 1 ) ,  and 
Pi  *  0  if  and  only  if  dn  *  0,  as  asserted. 


Program  8  is  a  representation  of  the  shifted  version  of 
Citron's  algorithm  employing  (82)  and  (83)  with  branching  (at  lines 
10  and  14  of  the  recursion)  shown  explicitly.  We  now  repeat  our 
Reed-Sol omon  GF(ll)  example  for  program  8.  A  comparison  of  the 
polynomials  rN(x),  r0(x),  bN(x),  b0(x),  aN(x),  and  a^(x) 
with  Citron's  arrays  z_1R2(i),  Ri ( i ) ,  z_1P2(i),  Pj_ (1 ) ,  z‘1T2(i),  and 
T^i)  shows  them  to  be  identical  (except  for  shifts  and  directions). 


Example  13:  t  *  3;  c(x)  *  0,  v(x)  =  e(x)  =  6x9  +  5x8  +  3x3 
S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 


=  0,  *  =  0 


rN(x)  =  10x6  +  7xb  +  9x4  +  8x3  +  2x2  +  9x 

r°(x)  =  x 

bN(x)  =  1 

b°(x)  =  0 

aN(x)  =  0 

a°(x)  =  lOx 


1,  A  =  0:  d  =  r?/r?  =9/1=9 


rN(x) 

rO(x) 

bN(x) 

b°(x) 

aM(x) 


10x6  +  7x5  +  9x4  +  8x3  +  2x2 

10x7  +  7x6  +  9x5  +  8xH  +  2x3  +  9x2 

1 


,  i  =  1:  d  =  r‘i/r2  =  2/9  =  10 


rN{x)  =  10x7  +  6x6  +  5x5  +  6x4  +  10x3 
r°(x)  =  10x8  +  7x7  +  9x6  +  8x5  +  2x4  +  9x3 
bN(x)  =  x  +  1 


bO(x) 


aN(x)  =  9x 
a°(x)  =  0 


,1  =  1:  d  =  r-3/r?  =  10/9  =  6 


rN(x)  =  6x8  +  x7  +  7x6+  x5  +  5x4 

r°(x)  =  10x8  +  6x7  +  5x6  +  6x5  +  10x4 

bN(x)  =  5x2  +  x  +  1 

b°(x)  =  x2  +  x 

aN(x)  =  9x 

a°(x)  =  9x2 


,  l  *  2:  d  =  rlj/r2  =  5/10  =  6 


r*J(x) 

r°(x) 

bN(x) 

bO(x) 

aN(x) 

a°(x) 


x8  +  9x7  +  10x6  +  9x5 

10x9  +  6x8  +  5x7  +  6x6  +  10x5 


xJ  +  x 

.  o 


+  6x  +  1 

w2 


mm 

m 


m 

M 


i  =  5,A  =  2:d  =  rg/rg  =  9/10  =  2 


rN(x)  =  2x9  +  10x7  +  9x6 
r°{x)  *  x9  +  9x8  +  10x7  +  9x6 
b«(x)  =  9x3  +  8x2  +  6x  +  1 
bO(x)  =  10x3  +  6x2  +  x 
aN(x)  =  4x3  +  x2  +  9x 
aO(x)  =  x3  +  9x2 

i  =  6,  *  =  3:  d  =  r^/rg  =9/9=1 

rN(x)  =  x9  +  2x8 
rO(x)  =  x10  +  9x9  +  10x8  +  9x7 
dN(x)  =  10x3  +  2x2  +  5x  +  1 
bO(x)  =  10x4  +  6xJ  +  x2 
aN(x)  =  3x3  +  3x2  +  9x 
a&(x)  =  x4  +  9x3 

i  =  7,  stop. 

\(x)  =  bN(x)  =  10x3  +  2x2  +  5x  +  1 
Q(x)  =  a^(x)/x  =  3x2  +  3x  +  9 
x2tMx)  =  rN(x)/x  =  xR  +  2x7 
A ( x )  =  x2  +  2x 

Program  8  Is  more  efficient  than  program  7,  requiring  on  the  order 
of  4t2  multiplications  instead  of  6t2  for  correcting  t  errors.  The 
r  respecifications  of  rO(x),  aO(x),  and  b®( x)  in  lines  15-17  of 

*  the  recursive  section  have  been  simplified  by  deletion  of  the 

)  multiplication  by  d-1.  Like  program  7,  program  8  requires  2t  basic 

1 

> 

* 


time  units,  but  a  basic  time  unit  now  includes  a  finite  field 
division  at  every  iteration  (instead  of  a  finite  field  inversion  at 
alternate  iterations)  as  well  as  a  multiplication  and  a 
subtraction.  Any  way  you  look  at  it,  however,  this  is  still  the 
Berlekamp-Massey  algorithm.  Citron's  achievement  has  been,  not  to 
derive  a  new  decoding  algorithm,  but  to  show  an  equivalence  between 
the  Berlekamp-Massey  algorithm  and  Kung's  generalization  of  Lanczos' 
algorithm. 

7.3  INSIDE  EUCLID'S  ALGORITHM 

In  this  section  we  revisit  Euclid's  algorithm  for  polynomials 
(program  3)  in  order  to  take  apart  the  polynomial  division  defined 

!  in  the  first  step  of  the  recursion 

» 

) 

q(x)  «■  [rO(x)/rN(x)J  (84) 

To  keep  things  as  simple  as  possible,  we  shall  work  with  the 
original  version  of  Euclid's  algorithm  rather  than  with  the  extended 
version  which  obtains  a(x)  and  b(x).  We  wish  to  replace  the 
polynomial  division  of  (84)  by  a  sequence  of  k  +  1  partial  divisions 
where  k  is  an  integer  defined  by 

k  =  deg(r°(x))  -  deg(rN(x))  =  deg(q(x))  (85) 

Except  at  the  initial  iteration,  where  rO(x)  =  f ( x)  and  r^(x)  = 

(g(x)  may  result  in  k  =  0,  we  have  k  >  0  for  all  polynomial  divisions 
(84).  Usually  after  the  first  iteration  of  program  3,  though  not 
always,  q(x)  is  linear  so  that  k  =  1. 


In  performing  the  sequence  of  k  +  1  partial  divisions  we  do  not 
want  to  redefine  the  divisor  rN(x)  until  the  polynomial  division, 
i.e.,  the  k  +  1st  partial  division,  is  completed.  Until  that  time, 
the  new  remainder,  say  rT(x),  becomes  the  next  numerator  rO(x) 
and  the  divisor  rf4 ( x )  is  unaltered  (except  for  shifting  right  to 
line  up  appropriately  with  r^ ( x ) ) .  On  the  other  hand,  when  the 
polynomial  division  is  complete,  the  new  remainder  becomes  the  next 
divisor,  while  the  old  divisor  becomes  the  next  numerator.  Thus,  we 
have 

rNx)  *-  x-1rN(x) 

r°(x)  «-  r°(x)  -  qrN(x) 
and 

rN(x)  <-  r°(x)  -  qrN(x) 

’  if  completing  k  +  1st  (87) 

r0(x)  *  x-irN(x)  a  P°1ynonia'1  division 

where  q  is  a  scalar  defined  as  the  ratio  of  the  leading  coefficient 
of  r°(x)  to  the  leading  coefficient  of  r^tx). 

In  order  to  perform  the  subtractions  in  (86)  and  (87)  it  is 
necessary  to  align  the  leading  coefficients  of  the  polynomials 
rN(x)  and  rO(x).  For  convenience,  we  turn  the  polynomials 
arouna  and  align  the  trailing  coefficients.  In  this  way  we  can  deal 
with  the  coefficients  r|j  and  r^  at  the  jth  iteration  and  define 
q  as  r^/rj  if  rj  *  0.  Specifically,  we  shall  initialize 
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if  not  completing  k  times  (86) 
a  polynomial  division 


r^(x)  by  the  reversal  ?(x)  of  f{x).  To  properly  align  r^(x)  and 

rN(x)  as  f(x)  and  g(x)  are  initially  aligned,  we  initialize 

rN(x)  by  xdg(x),  where  d  =  deg(f(x))  -  deg(g(x))  is  assumed  to 

be  strictly  positive. 

Program  9  is  a  representation  of  Euclid's  algorithm  for 
polynomials  (in  the  unextended  version)  when  each  polynomial 
division  is  replaced  by  a  sequence  of  k  +  1  partial  divisions,  where 
k  is  defined  by  (85).  The  polynomials  r^(x)  and  r^(x)  have  been 
initialized  by  ?(x)  and  x^g(x),  respectively.  Correspondingly,  at 
termination,  the  gcd(?(x),  g( x ) )  is  given  by  r^(x),  so  that 

gcd(f(x),  g( x ) )  is  given  by  r°(x),  as  shown  in  section  3. 

Program  9  is  not  essentially  different  from  program  3,  but 
shows  explicitly  what  is  implied  by  (84).  The  recursion  is  divided 
into  two  loops,  the  left  one  for  completing  a  polynomial  division, 
and  the  right  loop  for  continuation  of  the  division.  Choice  of 
which  loop  to  follow  is  determined  by  the  integer  variable  i. 

Assume  that  as  we  complete  one  polynomial  division  and  initiate 
its  successor  we  have  j  =  Zi.  (Observe  that  j  can  never  be  less 
than  25,  for  5  is  incrememented  only  when  j  >  25.,  and  j  is 
incremented  at  the  same  time;  on  the  other  hand,  if  j  >  2J5,  we  stay 
in  the  continuation  loop.)  If  deg(r^(x))  =  deg(rN(x))  +  k 
in  program  3,  then  we  want  to  execute  the  right-hand  loop  of 
program  9  k  times,  followed  by  one  execution  of  the  completion 
loop.  Before  the  initial  incrementation  of  j  we  have  r§  =  0, 
rj+i  *  0,  and  r^+i  =  0  for  i  =  0,  ....  k-1  and  r^+k  *  0. 

After  the  initial  incrementation  of  j,  r^  *  0.  The  upper  part  of 
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COMPLETION  LOOP 


INPUT:  POLYNOMIALS  f(x),  g(x);  INTEGER  d  -  deg  (f(x)  -(Mg  (g(x»  >  0 
OUTPUT:  gcd  (f(x),g(x))  *  y?(x) 


Program  9.  EUCLID'S  ALGORITHM  WITHOUT  POLYNOMIAL  DIVISION 


the  completion  loop  is  first  executed  k  times;  each  iteration 

increments  j  and  shifts  r^(x)  once,  so  that  r9  is  unchanged  as  j 

►  >  J 

increases.  After  k  executions,  rj  *  0,  so  that  we  enter  the 
continuation  loop  with  j  =  2?.  +  k.  In  the  continuation  loop, 
rN(x)  is  shifted  once  at  each  iteration,  so  that  rjj  now  remains 
fixed  and  nonzero.  We  remain  in  the  continuation  loop  for  k 
executions,  incrementing  both  j  and  £,  after  which  we  return  to  the 
completion  loop  witn  j  =  2°..  The  final  remainder  r^{x)  becomes 
the  divisor  in  the  next  polynomial  division,  while  the  current 
divisor  r^(x)  becomes  the  numerator  for  the  next  polynomial 


division.  Termination  occurs  when  the  new  divisor  rM(x)  =  0  after 
completion  of  a  polynomial  division.  For  an  example  illustrating 
program  9  we  repeat  example  2  from  section  2. 


Example  14:  f(x) 
g(x) 
d 


x5  +  3x4  +  3x2  +  5x  +  10 
2x2  +  7x  +  3 
5-2*3 


over  GF (11) 


j  *  0,  i  -  0 


r^(x)  *  xT(x)  =  10x6  +  5x5  +  3x4  +  3x2  +  x 
rN(x)  <-  x3g(x)  =  3xs  +  7x4  +  2x3 


j  =  i,  i  =  0,  r?  =  0 


r0(x)  «-  xr^(x)  =  10x7  +  5x6  +3x6  +  3x3  +  x2 


j  =  2,  A  =  0,  r^  *  0 


r°(x)  ♦  xr0(x)  =  10x8  +  5x7  +  3xb  +  3x4  +  "3 


,  *  =  0,  r’f  =  2,  r?  =  1,  q  =  1/2  =  6 


rT{x)  «•  r0(x)  -  qrN(x)  =  10x8  +  5x7  +  3x6  +  4x5  +  5x4 
r^{x)  i-  rT(  x ) 

rN(x)  <•  xrN(x)  =  3x6  +  7x5  +  2x4 


,  A  -  1,  rj  -  2,  rj  -  5.  q  -  5/2  =  8 


rT(x)  <-  r°(x)  -  qrN(x)  =  10x8  +  5x7  +  xb  +  3x5 
rO(x)  -  rT(x) 

rN(x)  «-  xrN(x)  =  3x7  +  7x6  +  2x5 


,  s  =  2,  rjj  =  2,  r§  =  3,  q  =  3/2  =  7 


rT(x)  <-  r°(x)  -  qr,J(x)  =  10x8  +  6x7  +  7x6 
rO(x)  *  rT(x) 

rN(x)  «-  xrN(x)  =  3x8  +  7x7  +  2x6 


,  »  =  3,  rjj  =  2,  r§  =  7,  q  =  7/2  =  9 


rT(x)  *  r°(x)  -  qrN(x)  =  5x8  +  9x7 
r°(x)  «-  xrN(x)  =  3x9  +  7x8  +  2x7 
rN(x)  *  rT(x) 


,  i  =  3,  rj  =  9,  r§  =  2,  q  =  2/9  =  10 


rT(x)  «-  rO(x)  -  qrN(x)  =  3x9  +  x8 
r°(x)  «-  rT(x) 

rN(x)  <-  xrN(x)  =  5x9  +  9x8 


i'*.i 
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j  *  8,  I  5  4:  r*j  =  9,  rj  =  1,  q  =  1/9  =  5 

rT(x)  *  r°(x)  -  qrN(x)  =  0 
r^(x)  «-  xrM(x)  =  5x10  +  9x9 
r^(  x)  <-  r^(x)  =  0 

stop 

6»gcd(f(x),  g(x})  =  rO(x)  =  9x  +  5 

3=9,y=3_1-5 

gcd(f(x),  g( x ) )  =  x  +  3 

A  comparison  of  this  example  with  example  2  shows  that  the 
sequence  of  q's  defined  above  is  identical  to  the  successive 
coefficients  of  q(x)  obtained  in  the  earlier  example. 

Program  9  is  somewhat  awkward  because  two  different  loops  are 
followed  according  as  we  are  concluding  a  completion  loop  or  a 
continuation  loop.  Both  loops  contain  some  statements  in  common, 
namely,  lines  4-8  in  the  completion  box.  both  loops  define  a  new 
polynomial  rT(x)  at  line  7,  and  retain  this  new  polynomial 
together  with  one  of  the  pair  (rN(x),  rO(x)).  In  the 
continuation  loop  rT( x )  becomes  the  new  numerator,  while  the  other 
retained  polynomial  (shifted)  becomes  the  divisor;  in  the  completion 
loop  the  assignments  are  reversed:  r^{x)  becomes  the  new  divisor, 
while  the  other  retained  polynomial  (shifted)  becomes  the 
numerator.  Surprisingly,  all  that  really  matters  is  that  the 
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correct  pair  of  polynomials  be  retained.  It  does  not  matter  which 
polynomial  is  assigned  to  be  the  numerator  and  which  the  divisor. 

We  can  take  advantage  of  this  fact  to  design  an  improved  algorithm. 

Let  us  first  observe  that  it  is  permissible  to  multiply  either 
rO(x)  r^(x)  by  an  arbitrary  scalar  b.  For  if 

r0(x)  =  nr^'x) 


and 


RN(x)  =  vrN(x) 

for  son1'-  scalars  B  ana  y,  then 
Q  =  Rj/Rj  *  (S/y)q 
and  Rt(x)  =  R°(x)  -  QR^(x) 

=  3r°(x)  -  (B/y)qyrN(x) 

=  3rT( x) . 

Thus  the  only  effect  produced  by  multiplying  r®(x)  or  rN { x )  by  a 
scalar  is  to  multiply  future  r®( x)  and  r^(x)  by  some  scalar. 

Next,  consider  the  effect  of  swapping  roles.  If  the 
assignments  of  polynomials  to  rO(x)  and  r^(x)  are  reversed  at 
some  stage,  then  we  shall  calculate  a  new  RT(x),  say,  at  line  7  by 
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RT( x )  =  r^(x)  -  (rj/r^)r°(x) 


so  that 

-qRT(x)  =  -qrN(x)  +  r°(x)  =  rT(x) 
or 

Rt(x)  =  -q-1rT(x). 

Thus,  reversing  the  roles  of  numerator  and  divisor  has  no  effect  on 
the  algorithm  other  than  a  multiplication  of  the  result  by  a  scalar 
so  long  as  we  take  care  to  retain  the  correct  pair  of  polynomials  at 
each  iteration. 

We  are  now  in  a  position  to  eliminate  the  continuation  loop 
from  program  9,  producing  an  improved  version  of  Euclid's 
algorithm.  We  shall  still  replace  each  polynomial  aivision  by  a 
sequence  of  k  +  1  partial  divisions.  After  the  initial 
determination  of  rT(x)  (in  a  given  polynomial  division)  we  omit 
the  branch  to  the  continuation  loop,  define  r^(x)  by  the  old 
divisor  in  the  last  line,  and  let  r^(x)  become  the  divisor  at  line 
2,  thus  reversing  the  roles  played  by  r^(x)  and  rN(x). 

Having  once  defined  rfyx)  at  line  9  of  the  completion  loop, 
however,  we  must  not  redefine  it  (except  for  shifts  in  line  1)  until 
the  polynomial  division  is  completed.  This  we  can  ensure  by  adding 
the  statement 
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t  «-  j  -  i 

to  the  end  of  the  completion  box,  and  replacing  the  branch  from  the 
test 

j  :  21 

with  a  branch  to  the  first  line  whenever  the  relation  is  satisfied 
by  <_.  The  first  time  through  we  have  j  =  2°  +  k.  At  the  new  last 
line  of  the  box,  i  is  redefined  as  =  j  -  l  -  9.  +  k.  Thereafter, 

the  branch  test  will  succeed  for  the  next  k  iterations,  until 
j  =  2<l  +  2k  =  2 ,  when  the  polynomial  division  is  finally 

completed.  During  these  k  iterations  the  roles  of  r°(x)  and 

rN(x)  remain  reversed.  The  divisor  polynomials  of  program  9  are 
now  numerators,  and  scalar  multiples  of  the  numerators  of  program  9 
are  now  divisors. 

There  is  one  more  point  to  be  made.  During  the  k  iterations 
with  reversed  roles  rj  is  fixed  and  nonzero  (as  rj  was  fixed  and 
nonzero  ^  program  9).  However,  it  is  possible  that  at  some  one  of 

thes*  derations  rj  is  zero,  causing  a  branch  to  line  1  from 

line  b.  Th  s  is  more  efficient  than  the  longer  path  taken  in 
program  9,  where  q  is  defined  as  0,  rT(x)  as  r^( x) ,  followed  by 
a  branch  to  the  continuation  loop  which  redefines  r^(x)  as 
rT(x),  i.e.  as  itself.  The  result  is  the  same;  the  path  taken  is 
longer  in  program  9. 

Our  final  version  of  Euclid's  algorithm  is  given  by  program  10 
and  illustrated  by  example  15,  a  reworking  of  examples  2  and  14. 

Example  15:  f(x)  =  x5  +  3x4  +  3x2  +  5x  +  10) 

g(x)  =  2x2  +  7x  +  3  >  over  GF ( 11) 


C  -  o 


r°(x)  -  f(x) 


rT(x)  -  xdg(x) 


INITIALIZATION 


AJIMV?  1^"  HP.  JC"  JO  S ?IWV KTXT *n.  Pin  v.  ~  t  .-w 


r°(x)  -  xr°(x) 


r“(x)  -  rT(x) 


r*(x)  :  0 


j  -  j  +  1 


rT:  0 

q  -  rf/rj* 

rT(x)  -  r°(x)  -  qr"(x) 


j  : 


r°(x)  -  r"(x) 


i  -  i-e 


RECURSION 


INPUT:  POLYNOMIALS  f(x),  g(x);  INTEGER  d  =  deg  (f(x))  -  deg  (g(x»  >  0 
OUTPUT:  gcd  (f(x),  g(x))  =  y?{t) 


Program  10.  SIMPLIFIED  EUCLID'S  ALGORITHM 


||p|j 


555K555 


SRSEH 


fe 


> v.v 

'vW^ 


♦ 


vrv  put 


ii  TV  VM  ITWiNV  mi  *\iKV  wu  *~uarw«w«rw*w«»- 


rororCT groronr rororeypra* *  * v mv  v  k  uw  l  h  um 


j  *  7,  l  =  3:  rj  =  10,  r§  =  2,  q  =  2/10  *  9 

rT(x)  •*-  r0(x)  -  qrN(x)  =  3x9  +  x8 
r0{x)  4-  xrN(x)  =  8x9  +  10x8 
rN(x)  ♦  rT(x) 


j  =  8,  »  =  4:  r j  =  1,  rj  =  10,  q  =  10/1  =  10 

rT(x)  <-  r°(x)  -  qrw(x)  =  0 
r°(x)  xrO(x)  =  8x10  +  10x9 
r'^x)  ■*-  rT(x)  =  0 


stop 


p  •  gcci( f ( x ) ,  g(x))  =  r°{x)  =  lOx  +  8 
9  «  10,  Y  »  p-1  =  10 
gcd(f(x) ,g(x))  =  x  +  3 

Program  9  is  an  exact  translation  of  Euclid's  algorithm  for 
polynomials  when  polynomial  division  is  broken  down.  Program  10  is 
cleaner  and  more  efficient  than  program  9.  Program  10  closely 
parallels  Berlekamp's  decoding  algorithm  and,  in  effect,  shows  why 
the  Berlekamp-Massey  algorithm  is  more  efficient  than  decoding 
algorithms  based  directly  on  Euclid's  algorithm.  In  section  7.4  we 
adapt  the  Mills'  decoding  algorithm  of  program  5  to  reflect  the 
changes  of  programs  9  and  10.  The  resulting  decoding  algorithms  are 
then  shown  to  be  equivalent  to  the  Euclidean! zed  Berlekamp-Massey 
algorithm  of  program  8. 


7.4  MILLS'  ALGORITHM  IN  BERLEKAMP-MASSEY  DRESS 


In  this  section  the  Mills'  decoding  algorithm  of  program  5  is 
modified  in  two  stages.  First,  the  polynomial  division  is  replaced 
by  a  sequence  of  partial  divisions  as  in  program  9.  The  resulting 
algorithm  is  essentially  the  same  as  program  5,  but  is  free  of 


polynomial  divisions  and  can  test  for  termination  by  counting 
iterations.  However,  like  program  9  it  suffers  from  a  more 
complicated  control  structure  in  that  the  recursive  section  consists 
of  two  distinct  loops.  In  the  second  stage  we  eliminate  the 
continuation  loop,  producing  an  algorithm  analogous  to  program  10. 
This  version  of  Mills'  algorithm  closely  parallels  program  8  and 
might  be  viewed  as  its  Euclidean  reflection.  Finally,  we  show  that 
these  new  decoding  algorithms  are  equivalent  to  the  Berlekamp-Massey 
algorithm  of  program  8. 

An  initial  change  which  we  make  in  program  5  in  order  to 
conform  to  the  initializations  of  programs  7  and  8  is  to  reverse  the 
signs  of  the  initial  values  of  rO(x)  and  a^( x ) .  In  program  5, 
this  would  have  the  effect  of  reversing  the  sign  of  q(x)  at  each 
iteration  and  of  r(x),  a(x),  and  b(x)  at  each  odd-numbered 
iteration.  Since  at  termination  \ ( x )  is  obtained  as  some  scalar 
multiple  of  bN(x),  and  n(x)  as  the  same  multiple  of  aN(x),  this 
sign  reversal  may  change  the  scalar  but  does  not  affect  the 
determination  of  A. ( x )  and  q(x),  nor  of  the  error  magnitudes. 

As  in  programs  9  and  10,  it  is  convenient  to  be  able  to  define 
q  at  the  jth  iteration  as  rj/rj  instead  of  as  the  ratio  of  the 
leading  coefficients.  To  achieve  this,  we  initialize  rN(x)  by 
xS(x)  and  rO(x)  by  1,  as  in  programs  7  and  8,  rather  than  by  5(x) 
and  -x^t,  as  in  program  5.  Initialization  of  rN(x)  by  xS(x) 
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i  -  o 

j  -  0 

r°(x)  -  1 
rT(x)  -  xS(x) 
a°(x)  -  -1 
aT(x)  -  0 
b°(x)  -  0 
bT(x)  -  1 


INITIALIZATION 


b°(x)  -  xb°(x) 
a°(x)  -  xa°(x) 
r°(x)  -  xr°(x) 
bN(x)  -  bT(x) 
aN(x)  -  aT(x) 
r*(x)  -  rT(x) 

j  -  j  +  * 

j  :  2t 
rf  :  0 
q  -  rf/if 

bT(x)  -  b°(x)  -  qbN(x) 
aT(x)  -  a°(x)  -  qaN(x) 
rT(x)  -  r°(x)  -  qr**(x) 
j  :  2f 
b°(x)  -  bN(x) 
a°(x)  -  aN(x) 
r°(x)  -  r"(x) 


b°(x)  - 

bT(x) 

8°(X)  - 

aT(x) 

r°(x)  - 

rT(x) 

bN(x)  - 

xbN(x) 

aN(x)  - 

xaN(x) 

^(x)  - 

xr*(x) 

f  - 

t  +  1 

CONTINUATION  LOOP 


COMPLETION  LOOP 


INPUT:  SYNDROME  POLYNOMIAL  S(x),  INTEGER  t 
OUTPUT:  >  a(x)  =  bN(x),  >  l)(x)  =  aN(x)/x 


Program  11.  MILLS'  ALGORITHM  WITH  PARTIAL  DIVISIONS 
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means  that  at  iteration  1,  rf  will  be  Si;  initialization  of  rO(x) 
by  1  means  that  at  iteration  1,  after  a  right  shift  of  rtyx),  r? 
will  be  1,  and  the  initial  q  is  defined  as  1/Si ,  as  required,  1  and 
Si  being  the  leading  coefficients  of  x^t  and  5(x). 

Program  11  is  a  representation  of  Mills'  algorithm  with  these 
initialization  changes  when  the  polynomial  division  is  broken  down 
into  its  partial  divisions  (i.e.,  program  11  is  the  Mills'  decoder 
analog  of  program  9).  This  is  not  a  different  algorithm  from  that 
represented  in  program  5,  but  explicitly  shows  what  is  implied  by 
the  first  statement  in  the  recursion  of  program  5 

q ( x )  <-  Lr°(x)/rN(x)J . 

The  recursion  in  program  11  is  divided  into  two  loops,  the  left  one 
for  completing  a  polynomial  division,  and  the  right  loop  for 
continuation  of  the  division  as  in  program  9.  Choice  of  which  loop 
to  follow  is  determined  by  the  integer  variable  a. 

Termination  of  the  program  can  now  be  decided  by  counting 
iterations  j  and  stopping  if  j  exceeds  2t.  For,  if  in  program  5 
deg( rN( x ) )  <  t  then  in  program  11  rj  =  0  and  the  program  makes 
no  further  changes  except  to  increment  j.  Suppose  2t  iterations  do 
not  suffice.  Each  polynomial  division  with  k  =  deg(q(x))  requires 
2k  iterations  (k  shifts  of  r°(x)  followed  by  k  trips  through  the 
continuation  loop).  Thus,  if  n  polynomial  divisions  are  required, 
and  ki  denotes  the  degree  of  the  1th  quotient  polnomlal  as  defined 
by  (85)  for  the  remainder  polynomials  of  program  5,  then 
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2  V  k.  =  2t  +  2s 
i=l  1 


where  s  >  0  if  2t  iterations  do  not  suffice. 


But 


T  k.=  deg(r_1(x))  -  deg(rn_1(x)) 
i=l  1 


=  2t  -  deg(rn“^(x)) 


Therefore, 

deg(rn”1(x))  =  2t  -  (t  +  s) 

=  t  -  s  <  t 

leading  to  a  contradiction.  Therefore,  2t  iterations  suffice  and 
program  11  need  not  test  for  the  degree  of  rN ( x ) . 

The  treatment  of  r(x)  is  slightly  different  from  that  of 
program  5.  We  need  to  keep  only  2t  terms,  and  at  each  iteration  j 
we  set  rj  *  0,  leaving  only  2t  -  j  coefficients  to  be  multiplied 
at  the  next  update.  We  thus  require  only  2t2  +  t  multiplications 
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for  updating  r^(x),  instead  of  the  3t2  +  t  implied  by  prog  am  5. 

In  program  5,  the  number  of  multiplications  could  also  be  reduced 
from  3t2  +  t  to  2t2  +  t  by  recognizing  that  terms  in  r ( x)  beyond 
2t  -  j  need  not  be  retained  after  iteration  j.  However,  this  same 
reduction  cannot  be  applied  in  program  4  without  losing  q(x)  in  the 
process.  2t  basic  time  units  are  required  in  program  11  to  correct 
t  errors,  where  a  basic  time  unit  consists  of  the  time  required  for 
one  finite  field  division,  one  multiplication,  and  one  subtraction. 

We  now  repeat  our  Reed-Solomon  3-error  correcting  code  example 
for  program  11. 

Example  16:  Reed-Solomon  3-error-correcting  code  over  GF(ll)  with 

a  =  2. 

t  *  3;  let  c ( x )  =  0,  v(x)  *  e(x)  =  6x9  +  5x8  +  3x3. 

S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9 

j  =  0,  o  =  0: 

rN(  x )  =  xS(x)  =  10x6  +  7x5  +  9x4  +  8x3  +  2x2  +  9x 

r°(x)  =  x  (after  shift) 

bN(x)  =  1 

bO(x)  =  0 

aN(x)  =  0 

a°(x)  =  -x  =  lOx 

j  *  1,  t  5  0:  q  =  r^/ri  =  1/9  =  5;  execute  continuation  loop 

rN(x)  «•  xrN(x)  =  10x7  +  7x6  +  9x5  +  8x4  +  2x3  +  9x2 

rO(x)  *  rO(x)  -  qr^(x)  =  5x6  +  9x5  +  10x4  +  4x3  +  x2 

bN ( x )  ♦  xbN(x)  =  x 

bO(x)  «-  bO(x)  -  qb^(x)  =  6 

aN ( x )  «-  xa^( x)  =  0 

a°(x)  <-  a^(x)  -  qaN(x)  =  lOx 
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j  =  2,  A  =  1:  q  =  r^/r"  =  1/9  =  5;  execute  completion  loop 

rN( x )  <-  rO(x)  -  qrN(x)  =  5x7  +  3x6  +  8x5  +  3x4  +  5x3 

r^(x)  ♦  xr^(x)  =  10x8  +  7x7  +  9x6  +  8x5  +  2x4  +  9x3 

bN(x)  b^(x)  -  qbN ( x )  =  6x  +  6 

b°(x)  •«-  xbN(x)  =  x2 

aN ( x )  4-  a°(x)  -  qaN ( x )  =  lOx 

a^(x)  4-  xa^(x)  =  0 

0  M 

j  =  3,  0  =  1:  q  =  r*3 /rj  =  9/5  =  4;  exceute  continuation  loop 


rN(  x) 
rO(x) 
b«(x) 
bO(x) 
aN(  x) 
aO(x) 


5x3  +  3x7  +  8x6  +  3x5  +  5x4 

10°  +  9x7  +  8x6  +  9x5  +  x4 

6x2  +  6x 

x2  +  9x  +  9 

10x2 

4x 


=  4,  a  =  2:  q  =  r^/rlj  =  1/5  *  9;  execute  completion  loop 


r*(x) 

rO(x) 

bN(x) 

bO(x) 

aN(x) 

aO(x) 


=  9x8  +  4x;  +  2xb  +  4x5 

=  5x9  +  3x8  +  8x7  +  3x6  +  5x5 

=  2x2  +  lOx  +  9 
=  6x3  +  6x2 

=  9x2  +  4x 


=  5,  9  =  2:  q  s  Tyr*  *  5/4  *  4;  execute  continuation  loop 


rN(x) 
rO(x) 
bN(  x ) 
bO(x) 
aN  ( x ) 
aO(x) 


9x9  +  4x8  +  2x7  +  4x6 
5x9  +  3x7  +  6x6 
2x3  +  10x2  +  9x 
6x3  +  9x2  +  4x  +  8 
9X3  +  4x2 
10x3  +  8x2  +  6x 


0  N 

j  =  6,  JL  =  3:  q  =  r£/r6  =  6/4  =  7;  execute  completion  loop 

rN(x)  =  8x9  +  5x6 
r°{x)  =  9xlu  +  4x9  +  2x8  +  4x7 
bN{x)  =  3x3  +  5x2  +  7x  +  8 
bO(x)  =  2x4  +  10x3  +  9x2 
aN{x)  =  2x3  +  2x2  +  6x 
a°(x)  =  9x4  +  4x3 

j  =  7,  stop:  y  =  8,  y"1  =  7 

\(x)  =  y-1rN(x)  =  10x3  +  2x2  +  5x  +  1 
°(x)  =  y-1aN(x)/x  =  3x2  +  3x  +  9 
x2t'(x)  =  v_1rN(x)/x  =  xB  +  2x7 
Mx)  =  x2  +  2x 

The  computational  flow  of  program  11  is  awkward,  like  that  of 
program  9.  A  more  efficient  algorithm  can  be  obtained  by 
incorporating  the  changes  of  program  10  into  Mills'  decoding 
algorithm.  Three  alterations  are  made  to  program  11: 

(1)  Eliminate  the  right-hand  loop,  returning  the  arrow  to  the 
top  of  the  left-hand  loop. 

(2)  Reverse  the  condition  for  taking  the  branch  from  ">M  to 


(3)  Add  the  specification  statement 
»  -  j  -  ? 

to  the  bottom  of  the  left-hand  loop. 
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These  changes  result  in  program  12,  which  is  seen  to  be  very  similar 
to  program  8.  Like  programs  8  and  9,  program  12  requires  2t2  +  t 
multiplications  for  updating  rT(x),  and  a  total  on  the  order  of 
4t2  multiplications  for  finding  \(x)  when  t  errors  have  occurred. 
Like  programs  8  and  9,  program  12  requires  2t  basic  time  units  where 
a  basic  time  unit  includes  the  time  to  perform  one  finite  field 
division,  one  multiplication,  and  one  subtraction. 

We  now  repeat  example  16  for  program  12. 

Example  17:  Reed-Sol omon  3-error-correcting  code  over  GF(ll)  with 

a  =  2. 

t  =  3;  let  c ( x )  =  0,  v(x)  =  e(x)  =  6x9  +  5x8  +  3x3. 

S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9 

j  =  0,  «.  =  0: 

rN(x)  =  xS(x)  =  10x6  +  7x5  +  9x4  +  8x3  +  2x2  +  9x 

r°(x)  =  x 

bN ( x )  =  1 

bO(x)  =  0 

aN(x)  =  0 

a°(x)  =  -x  =  lOx 

j  =  1,  A  *  0:  q  =  r?/r?  =1/9=5 

rN(x)  =  5x6  +  9x5  +  10x4  +  4x3  +  x2 

rO(x)  =  10x7  +  7x6  +  9x5  +  8x4  +  2x3  +  9x2 

bN( x)  =  6 

bO(x)  =  x 

aN ( x )  =  10 

a®(x)  =  0 
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j  =  2,  A  =  1:  q  =  r§/r^  =9/1=9 

rN(x)  =  10x7  +  6x6  +  5x5  +  6x4  +  10x3 

r°(x)  =  10x8  +  7x7  +  9x6  +  8x5  +  2x4  +  9x3 

bN(x)  =  x  +  1 

b°(x)  =  x2 

aN(x)  =  9x 

a°(x)  =  0 

j  =  3,  9  =  1:  q  *  r§/r^  =  9/10  =  2; 

rN(x)  =  10x8  +  9x7  +  8xto  +  9x5  +  x4 

rO(x)  =  108  +  6x7  +  5x6  +  6x5  +  10x4 

bN(x)  =  x2  +  9x  +  9 

bO(x)  =  x2  +  x 

aN(x)  =  4x 

a°(x)  =  9x2 

j  =  4,  Z  =  2:  q  =  rj/rj  =  10/1  =  10 

r^(x)  =  9x8  +  4x7  +  2x6  +  4x5 

r°(x)  =  10x9  +  6x8  +  5x7  +  6x6  +  10xb 

bN(x)  =  2x2  +  lOx  +  9 

bO(x)  =  x3  +  x2 

aN(x)  =  9x2  +  4x 

a°(x)  =  9x3 

j  =  5,  i  =  2:  q  =  r^/rs  =  10/4  =  8 

rN( x)  =  10x9  +  Ox8  +  6x7  +  x6 
r°(x)  =  9x9  +  4x8  +  2x7  +  4x6 
bN( x )  =  x3  +  7x2  +  8x  +  5 
bO(x)  =  2x3  +  10x2  +  9x 
aN( x )  =  9x3  +  5x2  +  x 
aO{x)  =  9x3  +  4x2 


MWWlJnU ?r  TOIgTOTOTW  vw^jvwjvwvv;yvr;  yv 


j  =  6,  A  =  3:  q  =  r?/rj  =4/1=4 

rN(x)  =  2x9  +  4x8 
rO(x)  =  9x10  +  4x9  +  2x8  +  4x7 
bN(x)  =  9x3  +  4x2  +  lOx  +  2 
bO(x)  =  2x4  +  10x3  +  9x2 
aN(x)  =  6x3  +  6x2  +  7x 
a°(x)  =  9x4  +  4x3 

j  =  7,  stop:  y  =  2,  v-1  =  6 

Mx)  =  Y_1r^(x)  =  10x3  +  2x2  +  5x  +  1 
°(x)  =  v_1aN(x)/x  =  3x2  +  3x  +  9 
x2tR(x)  =  v_1rw(x)/x  =  x8  +  2x7 
Mx)  =  x2  +  2x 

Transformation  of  program  11  into  program  12  is  justified  by 
the  same  arguments  used  for  validatng  the  transformation  of  program 
9  into  program  10.  We  now  demonstrate  that  at  each  iteration  j  the 
polynomials  aJ(x),  b-i(x),  and  rJ(x)  produced  by  program  12 
differ  from  those  produced  by  program  8  only  by  a  scale  factor.  To 
distinguish  between  the  polynomials  produced  by  the  two  algorithms 
we  shall  use  lower  case  r{x),  etc.,  for  program  12  (Mills)  and  upper 
case  R(x),  etc.,  for  program  8  (Berlekamp-Massey) .  We  assume  that 
at  iteration  j  at  the  instant  when  the  counter  j  is  incremented  the 
polynomials  r^(x)  and  rN(x)  are  related  to  R^(x)  and  RN(x) 
by 


r0(x)  =  RRO(x) 
rN(x)  =  yRn(x) 
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where  R  and  v  are  scalars  (field  elements),  and  show  that  if 

rT(x)  =  -qy Rt( x ) 

where 

q  =  r9/r^  =  (R/yld-1 

where  d  is  the  quantity  computed  at  line  9  in  the  recursion  section 
of  program  8.  We  have 

Rt(x)  =  rN(x)  -  dR°(x) 

or 

-d-1  Rt( x)  =  R°(x)  -  d-^Nlx). 

Therefore, 

-(v/R)qRT(x)  =  c?~ 1  r°( x )  -  (v/Rjqv^rNfx). 

Multiplying  by  R,  we  find  that 

-qyRT(x)  =  r°(x)  -  q rN ( x )  *  rT(x). 

Thus,  each  time  rj  *  0,  the  ratios  rN(x)/RN(x)  are  multiplied 
by  -q  to  obtain  rT(x)/R^(x);  when  rjj  *  0,  Rj  =  0  so  that  q 


is  undefined  and  r^{x)  and  R^(x)  are  unchanged.  For  j  =  0, 

Ki  k| 

rj  =  Rj  .so  that  at  iteration  j 

rT(x)  =  n  ( -q-j ) RT( x) .  (88) 

i  =  l 

qi  defined 

The  same  relationship  holds  between  a^(x)  and  A^(x)  and  between 
bT(x)  and  BT(x).  Observe  that  qj  is  never  zero  in  program  12, 

because  r°(x)  is  initialized  as  1,  r^(x)  is  redefined  as  rN(x) 

NO  N 

only  when  rj  *  0,  and  rj  does  not  change  when  rj  =  0.  Since 

program  8  solves  the  key  equation,  it  follows  that  program  12  must 

solve  the  key  equation  (14)  for  \{x)  and  n(x).  Again,  we  do  not 

care  if  the  polynomials  differ  from  those  produced  by  program  8  by  a 

scale  factor,  for  neither  the  Chien  search  nor  application  of 

Forney's  formula  (16)  is  thereby  affected. 

Let  j 

Mi  =  H  (-qi) 

J  1-1 

qi  defined 

in  expression  (88).  Table  1  shows  the  values  of  Mj,  r^(x),  and 
Rn(x)  obtained  in  a  comparison  of  examples  17  and  13.  From  this 
table  it  is  easily  seen  that  equation  (88)  is  satisfied  by  the 
outputs  for  rN( x )  and  RN(x)  from  programs  8  and  12  for  this 
example.  Similar  tables  can  be  constructed  for  aN(x)  and  bN(x). 
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1  5  5x°+9x2+10x‘4+4x3+x^ 

2  9  10x7+6xb+5x5+6x4+10x3 

3  2  10xc+9x7+8x6+9xb+x4 

4  10  9x8+4x7+2x6+4xb 

5  8  10x9+6x7+x6 

6  4  2x9+4x8 


6  lOx  °+7  x  b+9x  4+8x  3+2x  2 

1  10x7+6x6+5x5+6x**+10x3 

9  6x8+x7+7x6+x5+5x4 

9  x8+9x7+10x6+9x5 

5  2x9+10x7+9x6 

2  x9+2x8 


We  now  go  back  and  show  that  at  each  iteration  j  the 
polynomials  a J ( x ) ,  b j ( x ) ,  and  rJ(x)  produced  by  program  11 
differ  from  those  produced  by  program  12,  and  hence  from  those 
produced  by  program  8,  only  by  a  scale  factor.  Since  program  11  is 
equivalent  to  program  5,  this  demonstrates  the  equivalence  of  Mills' 
algorithm  and  Berlekamp’s  algorithm. 

To  distinguish  polynomials  produced  by  program  11  from  those 
produced  by  program  12  we  shall  use  lower  case  r(x),  etc.,  for 
program  11  and  upper  case  R(x),  etc.,  for  program  12.  We  now  have 
two  separate  cases  to  treat,  according  as  we  have  just  executed  the 
completion  loop  or  the  continuation  loop  in  program  11.  In  the 
former  case  the  branch  at  line  14  of  the  recursion  of  program  12  is 
not  taken;  in  the  latter  case  the  branch  is  always  taken. 

Case  1:  Completion  Loop 

We  assume 


rO(x)  =  BRO(x) 


rN(  x )  =  y  RN  ( x ) 
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for  some  field  elements  R  and  y,  and  show  that 

rT(x)  =  6RT(x) 

for  some  field  element  6,  namely  3. 

We  have 

q  =  rj/rj  =  QrJ/yRj  =  (3/y)Q 

rT(x)  =  r°(x)  -  qrN( x) 

=  RRO(x)  -  (3/y)QyRn(x) 

=  RRt(x)  (89) 

so  that  *  =  R ,  as  claimed.  At  the  conclusion  of  the  iteration,  we 
then  have 

r°(x)  (new)  =  rT(x)  =  brT(x)  *  RRM(x)(new) 
and 

r^(x)(new)  *  xrN(x)(ola)  =  yxRN(x)(old)  =  yR^xHnew), 

where  r°(x)(new)  denotes  the  value  of  rO(x)  just  prior  to  the 
next  incrementation  of  the  counter  j,  rN(x)  (old)  denotes  the 
value  of  rN(x)  at  the  last  incrementation  of  the  counter  j,  etc. 
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r^(x)(new)  =  xrN(x)(old)  =  vxR°(x)(old)  =  yR°(x)(new), 


satisfying  the  conditions  assumed  for  conclusion  of  a  continuation 
loop. 

Subcase  2b:  in  the  completion  loop 


We  have 

r°(x)(new)  =  xr^(x)(old)  =  yxR0(x)(old)  =  yR0(x)(new) 
r^( x ) ( new)  =  rT(x)  =  -(^/Q)RT(x)  =  6RN(x) 

satisfying  the  conditions  assumed  for  conclusion  of  a  completion 
loop. 


Since  initially  the  programs  begin  with 

rO(x)  =  R°(x) 
and 

rN(x)  =  Rn(x) 

the  assumptions  for  both  case  1  and  case  2  are  always  met.  Thus,  at 
every  iteration  j  the  polynomials  aJ(x),  bJ(x),  and  rJ(x) 
produced  by  program  11  differ  from  those  produced  by  program  12  only 
by  a  scale  factor  (though  of  courso  the  roles  of  r^(x)  and 
rN(x),  etc.,  are  sometimes  reversed). 


We  have  now  shown  that  program  5  (Mills'  algorithm)  is 
equivalent  to  program  11,  which  is  equivalent  to  program  12,  which 
is  equivalent  to  program  8,  which  in  turn  is  equivalent  to  program 
7,  which  is  an  expanded  version  of  program  6  (Berlekamp's 
algorithm) . 

We  conclude  that  Mills'  algorithm  and  the  Berlekamp-Massey 
algorithm  may  be  viewed  as  variants  of  Euclid's  algorithm  which  are 
equivalent  in  the  sense  that  partial  results  produced  by  one 
algorithm  can  be  mapped  directly  into  partial  results  produced  by 
the  other. 

To  complete  this  section  we  display  table  2  showing  the 
polynomials  rT(x)  and  R^(x)  defined  at  each  iteration  j  for 
examples  16  and  17  using  programs  11  and  12,  respectively.  Also 
shown  are  the  values  of  the  polynomials  rfyx),  rN(x),  R°(x), 
and  Rw(x)  at  the  beginning  of  the  iteration  ( i . e . ,  these  are  the 
values  determined  for  these  polynomials  during  the  j-lst 
iteration).  From  the  table  it  is  readily  observed  that  the 
relations  (89)  and  (90)  hold  for  rT(x)  and  RT(x)  determined  at 
iterations  following  execution  of  a  completion  loop  and  following 
execution  of  the  continuation  loop,  respectively. 
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Table  2.  A  Comparison  of  Outputs  from  Programs  9  (r'(x)) 

and  10  (Rt(x)) 


Program  11 


r°(x) 


rN(x) 


rT(x) 


10x6+7x5+9x4+8x3+2x2+9x 


5x6+9x5+10x4+4x3+x2 


5x6+9x5+10x4+4x3+x2 
10x7+7x6+9x5+8x4+2x3+9x2 
5x  7+3x  6+8x  5+3x  4+5x  3 


10x8+7x7+9x6+8x5+2x4+9x3 

5x7+3x6+8x5+3x4+5x3 

10x8+9x7+8x6+9x5+x4 


10x8+9x7+8x6+9x5+x4 

5x8+3x7+8x6+3x5+5x4 

9x8+4x7+2x6+4x5 


5x9+3xs+8x7+3x6+5xb 


9x8+4x7+2x6+4x5 

5x9+0x8+3x7+bx6 


5x9+0x8+3x7+bx6 

9x9+4x8+2x7+4x6 

8x9+5x8 


Program  12 


R°(x) 


RN(x) 


RT(  x) 


10x6+7x5+9x4+8x3+2x2+9x 

5x6+9x5+10x4+4x3+x2 


IOx  7+7x  6+9x  5+8x  4+2x  3+9x  2 

5x6+9x5+10x4+4x3+x2 

10x7+6x6+5x5+6x4+10x3 


IOx 8+7x 7+9x  6+8x 5+2x  4+9x  3 

10x7+6x6+5x5+6x4+10x3 

10x8+9x7+8x6+9xb+x4 


IOx  8+6x  7+5x  6+6x  5+10x  4 

10x8+9x7+8x6+9x5+x4 

9x8+4x7+2x6+4xb 


10x9+6x8+5x7+6x6+10x5 


9x8+4x7+2x6+4x5 


10x9+0x8+bx7+xb 


9x9+4x8+2x7+4x6 

10x9+0x8+6x7+x6 

2x9+4x8 
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7.5  COMPARISONS 


In  this  section  some  comparisons  are  made  among  the  various 
algorithms  which  have  been  treated  so  far.  We  first  compare, 
briefly,  the  versions  of  the  Berlexamp-Massey  algorithm  that  have 
been  discussed;  second,  we  make  comparisons  among  the  different 
versions  of  the  Euclidean  algorithm;  finally,  we  make  comparisons 
between  the  two  classes  and  consider  whether  there  is  any  choice  to 
be  made  between  programs  8  and  12. 

The  three  versions  of  the  Berlekamp-Massey  algorithm  which  have 
been  discussed  are  represented  by  programs  6,7,  and  8.  All  three 
programs  solve  the  key  equation  (14)  for  \(x);  programs  7  and  8  also 
provide  n( x )  at  the  cost  of  more  multiplications  and  storage  for 
a(x).  Program  6  requires  computation  of  the  discrepancy  d  at  every 
iteration  by  a  vector  inner  product  calculation  whose  length  grows 
at  each  iteration.  This  would  be  highly  undesirable  if  the 
algorithm  were  to  be  implemented  in  a  VLSI  systolic  array.  Programs 
7  and  8  avoid  this  calculation  by  retaining,  instead,  an  additional 
trio  of  polynomials  rfJ{x),  rO(x),  and  rT(x). 

Program  8  is  more  efficient  than  program  7  in  that  the  updates 

of  the  old  polynomials  rO(x),  aO(x),  bO(x)  in  lines  15-17  do 

not  require  a  multiplication.  However,  both  programs  may  be 

unsuitable  for  VLSI  implementation.  Program  7  usually  requires 

computation  of  a  finite  field  inverse  d_1  at  alternate  iterations, 

N  0 

while  program  8  requires  a  finite  field  division  rj/rj  at  every 
iteration.  Both  operations  are  considered  difficult  to  implement  in 
VLSI.  In  section  8.1  we  examine  Burton's  enhancement  of  the 
Berlekamp-Massey  algorithm.  This  modification  obviates  the  need  for 
computing  finite  field  inverses  or  performing  finite  field  division 
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within  the  Berlekamp-Massey  algorithm.  (Of  course,  a  division  is 
still  required  outside  the  algorithm  if  Forney's  formula  (16)  is 
used  to  calculate  the  error  magnitudes.) 


The  Eucliaean  decoding  algorithms  under  consideration  are 
represented  by  programs  4,  5,  11,  and  12.  It  is  clear  that  programs 
analogous  to  11  and  12  can  be  constructed  for  the  Japanese  algorithm 
of  program  4.  Both  programs  4  and  5  suffer  certain  deficiencies 
compared  to  programs  11  and  12  and  the  Berlekamp-Massey  programs: 
they  require  polynomial  division,  itself  an  iterative  algorithm;  in 
certain  situations  they  can  have  problems  with  termination;  and 
there  is  a  constant  irksome  need  to  determine  the  degrees  of 
polynomials  and  vary  action  accordingly. 


The  quotient  polynomials  q(x)  in  programs  4  and  5  are  usually, 
though  not  always,  linear.  On  the  average,  one  polynomial  division 
of  the  Euclidean  algorithm  is  equated  with  two  iterations  of  the 
Berlekamp-Massey  algorithm.  But  when  we  break  the  polynomial 
division  of  Mills'  algorithm  down  into  its  component  partial 
divisions  in  program  11,  the  number  of  iterations  becomes  2t  for 
both  algorithms.  The  difference  is  that  each  pair  of  iterations  in 
the  Berlekamp-Massey  algorithm  consists  of  two  nearly  identical 
steps,  whereas  each  pair  of  partial  divisions  In  the  Japanese  or 
Mills'  algorithms  consists  of  two  distinct  steps,  clearly  favoring 
the  former. 


Termination  in  programs  4  and  5  is  correctly  determined  If  the 
number  of  errors  does  not  exceed  t,  the  underlying  assumption. 
However,  in  the  Berlekamp-Massey  algorithm,  if  more  than  t  errors 
have  occurred,  the  length  *  of  the  shift-register  will  sometimes. 
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though  not  often,  exceed  t,  indicating  that  uncorrectable  errors 
have  occurred.  Clearly,  this  is  useful  information  which  may  be 
lost  in  programs  4  and  5.  Program  12,  however,  does  and  program  11 
may  retain  this  information.  In  program  11,  the  degree  of  A(x)  may 
have  to  be  tested,  for  0  in  this  program  is  not  a  shift-register 
length. 

All  of  these  programs  can  also  be  used  with  arbitrary 
(nonsyndrome)  sequences  outside  the  decoding  context.  However,  for 
programs  4  and  5,  there  is  no  certain  way,  with  an  arbitrary  input 
sequence,  of  knowing  when  to  halt  the  algorithm.  (The  other 
algorithms  are  terminated  correctly  by  defining  2t  to  be  the 
sequence  length.)  Consider  the  following  example. 

Example  18:  Let  GF(19)  be  generated  by  the  primitive  root  2.  Find 
shortest  length  LFSR's  to  generate  the  sequences 

Sj  :  14,  7,  12,  15,  7,  15,  12,  7,  14,  6 


s2  :  6,  14,  7,  12,  15,  7,  15,  12,  7,  14. 

The  Berlekamp-Massey  algorithm,  with  2t  =  10,  finds  the  solution 
\(x)  =  2x6  +  4x4  +  6x3  +  15x2  +  9x  +  1 


for  sequence  Si.  Programs  11  and  12,  with  2t  =  10,  find  scalar 
multiples  of  this  same  solution: 

14\(x)  =  9x6  +  18x4  +  8x3  +  x2  +  12x  +  14. 
and 

3  A ( x )  =  6x6  +  12x4  +  18x3  +  7x2  +  8x  +  3. 

However,  programs  4  and  5,  with  t  =  5,  terminate  too  soon  with  the 
polynomial 

b ( x )  =  15x4  +  2x3  +  16x2  +  2x  +  15. 

If  allowed  to  continue  for  one  more  iteration  (e.g.,  by  setting 
t  =  61  both  programs  find  a  correct  (though  different)  solution. 

When  the  reversal  input  sequence  s2  is  used,  both  programs  4 
and  5  terminate  correctly  if  t  is  chosen  to  be  5,  but  produce  an 
incorrect  result  if  t  is  set  equal  to  6.  Thus,  there  is  no  safe  way 
to  use  these  programs  with  an  arbitrary  input  sequence.  (The 
programs  terminate  correctly  if  the  sequence  is  repeated  once  and  t 
is  taken  to  be  its  original  length  10.)  Programs  8  and  12  have  no 
difficulty  with  sequence  s2. 

The  third  objection  to  programs  4  and  5  is  the  constant  need 
for  determining  the  degrees  of  the  polynomials  used  in  the 
algorithms,  and  for  varying  the  action  taken  accordingly.  Such  a 
determination  and  comparison  is  implicit  in  each  execution  of  (84). 


When  used  to  find  a  minimum  length  LFSR  which  generates  an 

arbitrary  sequence,  even  program  11  can  have  a  problem,  at 

termination,  in  determining  the  correct  degree  of  the  shift-register 
polynomial.  (Recall  that  ?  in  program  11  does  net  denote  the 
shift-register  length.)  A  correct  solution  for  the  sequence  s2  of 
example  18  is 

Mx)  =  Ox5  +  x4  +  9x3  +  15x2  +  9x  +  1. 

The  proper  shift-register  length  is  5,  not  4,  that  is,  the  last 

stage  must  be  included  even  though  it  is  not  tapped.  Program  11 
finds  a  correct  shift-register  connection  polynomial  (a  scalar 
multiple  of  \ ( x ) ) ,  but  is  unable  to  tell  the  correct  shift-regi ster 
length. 

We  conclude  that  for  three  substantial  reasons,  programs  4  and 
5  are  not  competitive  with  programs  7,  8,  and  12.  Program  11  also 
seems  to  be  out  of  contention;  there  is  no  reason  to  execute  two 
distinct  equally  complex  loops  instead  of  executing  one  of  them 
twice.  Thus,  we  are  left  with  comparing  programs  8  and  12. 

Between  these  two  programs  there  would  seem  to  be  no 
preference.  Both  have  the  flaw  that  a  finite  field  division  is 
required.  This  we  remove  in  section  8  by  applying  Burton's 
innovation.  Program  8  produces  a  monic  error  locator  polynomial 
Mx).  This  may  possibly  give  some  advantage  to  program  8,  depending 
on  the  method  chosen  to  complete  the  error  correction.  However,  as 
we  have  seen,  neither  the  Chien  search  nor  Forney's  formula  (16) 
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benefits  from  the  use  of  a  monic  error  locator  polynomial. 
Furthermore,  incorporation  of  Burton's  enhancement  eliminates  the 
monicity  advantage,  if  there  is  one. 

Table  3  summarizes  the  estimates  of  the  number  of 
multiplications  required  for  each  of  the  programs.  All  basically 
require  on  the  order  of  4t2  multiplications  to  find  a.{ x )  when  t 
errors  have  occurred.  Program  6  can  get  by  with  2t2  multiplications 
if  bO(x)  is  not  normalized,  but  does  not  provide  n ( x ) ,  and 
involves  an  unacceptable  delay  in  the  computation  of  the 
discrepancies.  Program  7  becomes  program  8  when  the  normalization 
of  b°(x),  a°(x),  and  rO(x)  is  omitted.  All  programs  except 
program  6  require  2t  basic  time  units  for  correction  of  t  errors, 
where  a  basic  time  unit  includes  the  time  required  for  a  finite 
field  division  (or  inversion),  a  multiplication,  and  a  subtraction. 

Table  3.  Number  of  Multiplications  Required 
for  Obtaining  \(x)  in  the  Presence  of  t  Errors 


Program 

Number  of  Multiplications 

Remarks 

4 

4t2 

5 

5t2 

1 

6 

2.5t2 

2 

7 

6t2 

8 

4t2 

11 

4t2 

12 

4t2 

1  can  be  reduced  to  4t2  by  dropping  terms  from  r(x) 

2  can  be  reduced  to  2t2  by  omitting  normalization  of  bO(x) 
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SECTION  8 


INVERSIONLESS  DECODING 

All  programs  which  have  been  considered  so  far  require 
inversion  of  finite  field  elements  or,  equivalently,  division  of 
finite  field  elements.  These  operations  are  generally  considered  to 
be  difficult  to  implement  efficiently  in  VLSI.  In  this  section  we 
consider  algorithm  modifications  which  avoid  explicit  computation  of 
inverses  ana  division  at  the  cost  of  further  multiplications.  In 
section  8.1  we  examine  Burton's  [27]  inversionless  variant  of  the 
Berlekamp-Massey  algorithm.  In  section  8.2  we  look  at  analogous 
methods  for  avoiding  the  computation  of  inverses  in  the  Euclidean 
decoding  algorithms. 

8.1  BURTON'S  ALGORITHM 

Burton  [27]  has  given  a  modification  to  Berlekamp's  algorithm 
which  eliminates  the  usual  inversion  of  the  discrepancy  d  at  each 
iteration,  or,  equivalently,  the  division  rj/r^  at  each 
iteration  in  Citron's  version  (program  8). 

At  line  11  of  the  recursion  in  program  8  we  want  to  compute  the 
new  shift  register  polynomial  by 

bT(x)  *  bN(x)  -  rrJN/rj°lb°(x)  (91) 


JX  VXVXVXV*  \.X IW IVIKW-Y, 


b°(x)  -  bN(x) 
a°(x)  -  aN(x) 
r°(x)  -  r"(x) 

(  -  i  -  i' 


RECURSION 


INPUT:  SYNDROME  POLYNOMIAL  S(x),  INTEGER  t 
OUTPUT:  ■>  a(x)  =  bN(x),  >  Q(x)  =  aN(x)/x 


Program  13.  BURTON'S  ALGORITHM 


1 

SI'S!' 


r°(x)  -  1 
rT(x)  -  *S(x) 
a°(x)  -  -1 
aT(x)  -  0 
b°(x)  -  0 
bT(x)  -  1 

INITIALIZATION 


rf  :  0 

bT(x)  -  rfbN(x)  -  rf*b°(x) 

aT(x)  -  rfaN(x)  -  ifa°(x) 

rT(x)  -  rfr^(x)  -  rfr°(x) 
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without  having  to  divide.  Multiplying  (91)  by  rj  gives 
r°bT(x)  =  r9bN(x)  -  r^b°(x). 

Burton  accepts  r^bT(x)  as  the  new  shift  register  polynomial  in 
place  of  bT(x)  as  defined  by  (91),  and  similarly  for  aT(x)  and 
rT(x),  giving  program  13.  At  the  termination  of  program  13  we  end 
up  with  a2t(x)/x  =  vQ(x)  and  b^x)  =  yA(x)  for  some  nonzero 
field  element  y.  This  is  acceptable,  since  neither  the  Chien  search 
for  the  error  locations  nor  Forney's  calculation  of  the  error 
magnitudes  is  affected.  Citron  [22]  has  also  used  Burton's 
modification  to  obtain  an  inversionless  algorithm  equivalent  to 
program  13. 

Example  19:  Reed-Solomon  3-error-correcting  code  over  GF (11)  with 

a  =  2. 


t  =  3;  Let  c(x)  =  0,  v(x)  =  e(x)  =  6x9  +  5x8  +  3x3 
S(x)  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9 


0,  t  =  0 


rN(x) 

r°(x) 

bN(x) 

bO(x) 

aN(x) 

a°(x) 


=  xS(x)  =  10x6  +  7x5  +  9x4  +  8x3  +  2x2  +  9x 


=  -x  =  lOx 
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j=4j£=2:r^=9,  r9=  2 

rN(x)  <-  2r,,i(x)  -  9r°(x)  =  8x8  +  6x7  +  3x6  +  6x3 

rO(x)  «-  xr^(x)  =  2x9  +  10xG  +  x7  +  10x6  +  2xb 

b^(x)  -  2bN(x)  -  9b°(x)  =  3x2  +  4x  +  8 

bO(x)  <-  xb^(x)  =  9x3  +  9x2 

aN( x )  «-  2aN(x)  -  9aC)(x)  =  8x2  +  6x 

a°(x)  <-  xa^(x)  =  4x3 

j=5,  t  s  2;  s  6,  r9=  2 
J  J 

rM(x)  *  2rN(x)  -  6r°(x)  =  10x9  +  Ox8  +  6x7  +  x6 
r°(x)  <-  xrN(x)  =  8x9  +  6xs  +  3x7  +  6x6 

bN(x)  «-  2bw(x)  -  6b0(x)  =  x3  +  7x2  +  8x  +  5 

bO(x)  xbN{x)  =  3x3  +  4x2  +  8x 
aN(x)  <-  2aN(x)  -  6a°(x)  =  9x3  +  5x2  +  x 
a°(x)  <-  xa,J(x)  =  8x3  +  6x2 


j  =  6 ,  o=3:  r. 

3 


r*(x) 
r°(x) 
bN(x) 
b<j(x) 
aN(  x) 
aO(x) 


♦  6rW(x)  -  r°(x)  =  8x9  +  5xb  +  Ox7 

♦  xr^(x)  =  8xiU  +  ox9  +  3x9  +  6x; 

♦  6bN(x)  -  bO(x)  =  3x3  +  5x2  +  7x  +  8 

<-  xb°(x)  =  3x4  +  4x3  +  8x2 

«-  6aN(x)  -  aO(x)  =  2x3  +  2x2  +  6x 

<-  xa°(x)  =  8x9  +  6x3 
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>  Wvvv'v' 


j  =  7,  stop,  v  =  8,  y  -  7 


\ ( x )  =  Y_1bN(x)  =  10x3  +  2x2  +  5x  +  1 
o(x)  =  Y_1aN(x)/x  =  3x2  +  3x  +  9 
x2tA(x)  =  y-M{x)/x  =  x8  +  2x7 
Mx)  =  x2  +  2x 

Burton's  modi f i cation  almost  doubles  the  number  of 
multiplications.  We  require  4t2  +  2t  multiplications  to  update 
rT(x),  2t2  +  2t  for  bT(x),  and  2t2  -  2t  +  2  for  aT(x)  for  a 
total  on  the  order  of  at2  multiplications. 

8.2  INVERSIONLESS  EUCLIDEAN  ALGORITHMS 


The  Japanese  decoding  algorithm  and  Mills'  algorithm  can  be  put 
into  inversionless  form  by  Burton's  technique  if  we  first  break  the 
polynomial  divisions  down  explicitly  into  their  partial  divisions 
as  was  done  in  program  11.  To  eliminate  the  finite  field  division 
in  the  completion  loop  of  program  11,  we  delete  statement  10  (which 
defined  q)  and  replace  statement  11  by 

bT(x)  -  A°(x)  -  r°bN(x)  (92) 

J  J 


etc.,  resulting  in  program  14.  Sugiyama,  et  al .  [12]  have  used 
Burton's  technique  to  yield  an  inversionless  variant  of  their 
algorithm  which  is  equivalent  to  program  14.  Program  14  requires  on 
the  order  of  8t2  multiplications.  Shao,  et  al .  [28]  have  also 
presented  an  inversionless  variant  of  the  Japanese  algorithm  which 
is  similar  to  program  14. 
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'  I.V^i 


— — 1 

1 

u 

j  - 

0 

r°(x)  - 

1 

rT(x)  - 

xS(x) 

a°(x)  - 

-1 

aT(x)  - 

0 

b°(x)  - 

0 

bT(x)  - 

1 

INITIALIZATION 


b°(x)  -  xb°(x) 
a°(x)  -  xa°(x) 
"  xr°(x) 
bN(x)  -  bT(x) 
aN(x)  -  aT(x) 
r*(x)  -  rT(x) 
i  -  i  +  1 


j  : 

2t 

?  : 

0 

bT(x)  - 

>ffc»0(x) 

-  rj>bN(x) 

aT(x)  - 

rfa°(x) 

-  f?aN(x) 

rT(x)  - 

rfV>(x) 

-  ifr^x) 

I  : 

2i 

b°(x)  - 

bN(x) 

a°(x)  - 

aN(x) 

»°w  - 

r*{x) 

COMPLETION  LOOP 

b°(x)  -  bT(x) 
a°(x)  -  aT(x) 
r°(x)  -  rT(x) 
bK(x)  -  xbN(x) 
aN(x)  -  xaN(x) 
r^x)  -  xrw(x) 

(’  -  C  +  1 

CONTINUATION  LOOP 


INPUT:  SYNDROME  POLYNOMIAL  S(x),  INTEGER  t 
OUTPUT:  i  X(x)  =  bN(x), }  fi{x)  =  aN(x)/x 


Program  14.  INVERSIONLESS  MILLS'  ALGORITHM 


Finally,  let  us  apply  Burton's  technique  to  the  simplified 
Mill's  decoder  of  program  12.  Analogous  to  the  changes  made  in 
program  8  to  obtain  program  13,  we  delete  statement  10  of  the 
recursion  which  defined  q,  and  modify  statements  11-13,  replacing 
statement  11  by  (92),  etc.,  as  in  program  14.  These  changes  produce 
program  15.  It  is  manifest  that  programs  13  and  15  are  equivalent, 
the  only  difference  being  the  signs  of  the  r.h.s.  of  statements 
10-12  of  the  recursion.  Example  20  shows  how  this  change  affects 
the  partial  results. 

Example  20:  Reea-Solomon  3-error-correcting  code  over  GF (11)  with 

a  =  2. 

t  =  3;  Let  c(x)  =  0,  v ( x )  =  e(x)  =  6x9  +  5x8  +  3x3. 

S( x )  =  10x5  +  7x4  +  9x3  +  8x2  +  2x  +  9. 

j  =  0,  *  =  0 

rN(x)  =  xS(x)  =  10x8  +  7x5  +  9x4  +  8x3  +  2x2  +  9x 

r°(x)  =  x 

bN(x)  =  1 

b°(x)  =  0 

af>(x)  =  0 

a°(x)  =  -x  =  lOx 

J  »  1,  1  -  0:  rj  =  9,  rj  ■  1 

rN(x)  «.  gr0(x)  -  rN(x)  =  x6  +  4x5  +  2x4  +  3x3  +  9x2 

r°(x)  <-  xrN(x)  =  10x7  +  7xb  +  9x8  +  8x4  +  2x3  +  9x2 

bN(x)  «■  9b0(x)  -  b*(x)  =  10 

b^(x)  *■  xbN(x)  =  x 

aN(x)  «-  9a^(x)  -  aN(x)  =  2x 

a^(x)  <-  xa^(x)  =  0 
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RECURSION 


INPUT:  SYNDROME  POLYNOMIAL  S(x),  INTEGER  t 
OUTPUT:  ■>  a(x)  =  bN(x),  >  fl(x)  =  aN(x)/x 

Figure  15.  BURTONIZED  MILLS'  ALGORITHM  -  FINAL  VERSION 
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VV’S 


k*  i.'i.'f  v»i  n|j|  f  1 1 


2,  A  «  1:  r?  -  9,  r?  «  9 

J  J 


rN(x) 
r°(x) 
bM(x) 
b°(  x) 
aN(x) 
a°(x) 


9r°(x) 
xr^(x) 
9b0(x) 
xbO(x) 
9a0(x) 
xa^f  x) 


-  9rN(x)  =  2x7  +  10x6  +  x5  +  10x4  +  2x3 


+  7x/  +  9x  +  8xb  +  2x4  +  9x; 


9b^(x)  =  9x  +  9 

*,2 


9a^(x)  =  4x 
0 


3.  *  ■  1:  rj  •  2,  rj  .  9 


rN(x)  <-  2r°(x) 
r0(x)  «-  xr^(x) 
bf,l(x)  *  2b°(x) 
b^(x)  <-  xb^(x) 
aN(x)  <-  2a°(x) 
a°(x)  *  xaN(x) 


9rN(x)  =  9x8  +  7x7  +  5xb  +  7x6  +  2x4 
2x8  +  10x7  +  x6  +  10x5  +  2x4 
9bN(x)  =  2x2  +  7x  +  7 
9x2  +  9x 
9aN(x)  =  8x 


4  5  =  2  ■  rv  =  2  r9  =  2 


r^(x) 

r°(x) 

bN(x) 

b°(x) 

aN(x) 

a°(x) 


2rO(x) 

xr°(x) 

2bO(x) 

xb°(x) 

2a°(x) 

xa°(x) 


2rN(x)  =  8x8  +  6x7  +  3x6  +  6x5 
2x9  +  10x8  +  x7  +  10x6  +  2xD 
2bN(x)  =  3x2  +  4x  +  8 


9x3  +  9x2 


-  2aw(x)  =  8x2  +  6x 


m 


r**J*  W- 
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s&S&S 


v'*  .*■ 
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SECTION  9 
DECODING  ERASURES 

In  this  section  earlier  results  are  extended  to  include  the 
decoding  of  erasures  in  addition  to  errors,  where  an  erasure  is  an 
error  whose  location  is  already  known  to  the  decoder.  A  BCH  t-error 
correcting  code,  with  minimum  distance  2t  +  1,  is  capable  of 
correcting  any  combination  of  v  errors  and  p.  erasures  for  which 
2v  +  u  <_  2t.  Forney  [9]  first  showed  that  by  employing  modified 
syndromes  one  can  still  solve  for  the  error  locator  polynomial  in 
the  presence  of  erasures.  Blahut  [29]  showed  that  the  errata 
locator  polynomial  (where  an  erratum  is  either  an  error  or  an 
erasure)  can  be  calculated  directly  (without  first  finding  the  error 
locator  polynomial)  by  initializing  Berlekamp's  algorithm  with  the 
erasure  locator  polynomial.  In  this  section  we  combine  these 
results  to  give  a  program  which  provides  both  the  errata  locator 
polynomial  and  the  errata  evaluator  polynomial.  Errata  magnitudes 
can  then  be  calculated  by  Forney's  formula  (16)  or  by  the  new 
formula  (24). 

As  in  section  3,  we  assume  a  BCH  code  designed  to  correct  t 
errors  in  a  codeword  of  length  n  =  qm  -  1  for  q  a  power  of  a 
prime.  Let  c(x)  represent  the  transmitted  codeword  polynomial  and 
e(x)  be  an  error  polynomial.  In  addition,  let  d(x)  represent  the 
channel  erasure  polynomial.  The  received  codeword  polynomial  is  now 

v ( x )  =  c(x)  +  d(x)  +  e(x). 
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We  define  2t  error  syndromes  by 


s.  =  v(aJ)  =  c(aJ)  +  d(  >  +  e(rrJ) 

J 


=  d(ar)  +  e ( or.  )  (j  =  1,  2t) 


where  a  is  a  primitive  element  of  GF ( q01 ) . 


Suppose  v  errors  and  u  erasures,  where  2v  +  ^  <_  2t,  have 
occurred  during  transmission.  We  define  v  unknown  error  locations 
Xp,  where  Xo  is  the  field  element  of  GF ( qm)  associated  with 
the  error  location,  and  v  unknown  error  magnitudes  Y^,  where 

Yo  >  0  and  Y*  e  GF(q).  In  addition,  we  now  define  u  known  era¬ 
sure  locations  W|<  z  &F(qm),  where  W|<  is  the  field  element 
associated  with  the  kth  erasure  location,  and  u  unknown  erasure 
magnitudes  e  GF(q).  The  W|<  are  always  assumed  to  be  distinct 
from  the  X9.  is  the  difference  between  the  transmitted 
symbol  at  location  W|<  and  the  symbol  assumed  for  the  kth  erasure 
at  the  receiving  end.  Unlike  Y$ ,  V|<  may  assume  the  value  0. 

The  2t  syndromes  are  now  given  by  the  2t  BCH  decoding  equations 


S,  =  e(^)  +  dUJ')  =  ?  Y  x]  +  V  V.  wj 
J  f.=l  '  *  k=l  K  K 


=  Ej  +  Dj ,  (j  =  1,  •••,  2t) 


The  error-and-erasure  decoding  problem  for  BCH  codes  is  to  solve 
this  set  of  2t  (nonlinear  simultaneous)  equations  for  the  v  unknown 
error  locations  Xf,  the  v  unknown  error  magnitudes  Y^,  and  the  n 
unknown  erasure  magnitudes  V^,  given  the  2t  syndromes  Sj  and  the 
u  erasure  locations  W^.  Forney's  solution  is  to  derive  from  the 
set  of  2t  equations  (93)  a  reduced  set  of  2t  -  u  equations  of  the 
form  (9)  which  can  be  solved  for  the  error  locator  polynomial  a(x). 

If  we  define  a ( x )  by  (10),  and  Ej  by 

V 

E.  =  V  yxj,  (j  =  1,  2t) 

J  *=1 

then  by  a  process  identical  to  that  which  obtained  equation  (11) 
from  (10)  in  section  3,  we  arrive  at 


Therefore, 


U  U  (1 

T  =  V  <  S.  .  =  y  < . ( E .  .  +  D .  . )  =  y  <.E.  .. 
3  i=o  1  J  1  i'=0  1  J-1  i=o  1  J-1 


We  now  multiply  (94)  by  and  sum  u  +  1  successive  equations, 
using  (97)  to  obtain  the  set  of  2t  -  u  equations 


ii  v 

o  =  y  (e-i  +  y  A  .£ .  .  1 

?.=0  A  1  J+v-*  i=l  1  J+v-i-AJ 

Tj+v  +  AiT j+v-i  *  U  =  u  +  1 . 2t). 


This  set  of  2t  -  u  equations  in  v  unknowns  Aj,  where  2v  _<  2t  -  u, 
is  exactly  analogous  to  the  set  (11),  and  can  be  solved  for  a(x)  by 
the  Peterson-Gorenstein-Zierler  algorithm  exactly  as  in  section  3. 

The  modified  syndrome  polynomial  is  defined  analogously  to  S(x) 


2t  .  . 

T(x)  =  y  T.x'3"1 
j=l  J 

=  |< (x)S(x) |  2t 


and  the  errata  locator  polynomial  n(x)  is  defined  as  the  product  of 
the  erasure  locator  polynomial  and  the  error  locator  polynomial: 


n(x)  =  <(x)a(x). 
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We  shall  re-use  q(x)  to  denote  the  errata  evalulator  polynomial, 
which  is  defined  by  the  key  equation  for  erasure-and-error  decoding: 


Q ( x )  =  n(x)S(x)  2t 


Any  of  the  programs  (see,  e.g.,  Berlekamp  [8],  pp.  229-231  and 
Sugiyama,  et  al .  [30])  supplied  in  sections  4-8  can  be  used  to 
decode  both  erasures  and  errors,  yielding  a(x)  and  ^(x),  if  we  first 
replace  S(x)  by  T(x),  as  computed  by  (99).  The  error  locations  can 
be  determined  by  applying  a  Chien  search  either  to  A(x)  or  to  T(x). 
T?(x)  can  be  obtained  from  \(x)  and  <(x).  Forney’s  formula  (16)  now 
becomes 

PUT1) 

Y  = - L-  (102) 

J  n'UT1) 

J 

where  Yj  and  Xj  are  now  interpreted  as  errata  magnitudes  and 
locations,  and  j  runs  from  1  to  v  +  i». 

However,  Blahut  [29]  has  pointed  out  that  it  is  unnecessary  to 
obtain  /v(x).  If,  in  Berlekamp's  algorithm,  the  shift-register 
connection  polynomial  b(x)  is  initialized  by  the  erasure  locator 
polynomial  k(x),  then  at  termination  this  polynomial  will  yield  n(x) 
in  place  of  /v(x).  (In  program  6  we  Initialize  i  and  j  by  the  number 
of  erasures  u  and  y(x)  by  <(x);  the  length  test  (line  8  of  the 
recursion)  is  modified  to  "j  +  u  :  2®";  the  length  specification 
(line  10)  is  changed  to  «-  j  -  i  +  u";  and  the  modified  syndromes 
Tj  are  used  in  place  of  the  syndromes  Sj  at  line  5  of  the 
recursion.) 
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If  we  use  one  of  the  Euclideanized  versions  of  the  Berlekamp- 
Massey  algorithm,  we  directly  obtain  the  errata  evaluator  polynomial 
q ( x )  as  well  as  ntx).  We  show  this  in  program  16,  which  is  program 
13  (Burton's  algorithm)  modified  for  handling  erasures,  and  with 
a(x)  superimposed  on  r(x).  In  this  program  the  integer  u  represents 
the  number  of  erasures.  If  u  =  0,  the  program  functions  like 
program  13.  There  is  no  confusion  in  superimposing  aN ( x )  and 
rN(x),  since  at  iteration  j,  r^  =  0  for  i  <  j,  »  <  j,  and  by 

l.i  M 

(66)  a{  =  0  for  i  i.  When  u  >  0,  n  niay  be  nonzero  for  i  <  p 
at  iteration  j  >  u,  but  aj  =  0,  so  there  is  no  problem  in 
determining  rj. 

There  is  a  problem  with  r°(x),  however.  Note  that  r°(x)  is 
now  initialized  by  0,  tne  sum  of  the  initializations  for  a°(x)  and 
r°(x)  in  program  13,  but  we  still  need  r&(x)  =  1  for  the  initial 
update  of  rT(x).  Therefore,  following  Burton  [27],  we  retain  an 
additional  variable  5  to  represent  the  old  discrepancy  value.  This 
is  initialized  as  1  and  updated  as  the  current  discrepancy  d  at 
every  length  change.  The  variable  6  is  not  needed  if  a(x)  and  r(x) 
are  not  superimposed. 

At  termination  we  have 

rN(x)  =  aN(x)(-l)  +  bN(x)xS(x). 

For  bN( x )  to  give  vnU),  we  must  have,  by  (101), 

yQ(x)  =  |(aN(x)  +  rN(x) )/x|  n 

thus  providing  the  motivation  for  superimposing  the  two  polynomials. 

161 


AvtVrw: 


>  J  -fa*  -I  i1 


li'.h*  ti1 


r°(x)  -  0 
b°(x)  -  0 
rT(x)  -  xS{x) 
bT(x)  -  1 


i  *-  i  + 1 

rT(x)  -  (1  -  WjX)  rT(x) 
bT(x)  -  (1  -  W,x)  bT(x) 

j  :  n _ 

INITIALIZATION 


r°(x)  -  xr°(x) 
b°(x)  -  xb°(x) 
r**(x)  -  rT(x) 
bN(x)  -  bT(x) 
j  -  j  +  1 
j  :  2t 
d  -  if 
d  :  0 

rT(x)  -  ir^x)  -  dr°(x) 
bT(x)  -  5bN(x)  -  db°(x) 
i  +  M  :  2 1 
r°(x)  -  f"(x) 
b°(x)  -  bN(x) 

&  -  d 

f  -  i  - 1  +  m 

RECURSION 


INPUT:  SYNOROME  POLYNOMIAL  S(x),  ERASURE  LOCATIONS  W,  (FIELD  ELEMENTS) 
NUMBER  OF  ERASURES  *  INTEGER  t 

OUTPUT:  >n(x)  =  bN(x),  >fl(x)  =  ^(xj/xl  „ 


Program  16.  DECODING  WITH  ERASURES 


« *** *  -  ’V *\  * •% 


A 


KjSK: 

M 


TrrjWVJVJ1^.'  v»  V.  j-j-mjn- »:  1TJ  w*  r  d  «  J 


If  A(x)  is  defined  analogously  to  (19)  by 

A(x)  =  L^UJSUD/x^J  (103) 

then  at  termination  of  program  16  we  have  A(x)  given  by 
[((aN(x)  +  rN(x))/x)/x2tJ ,  which,  since  a(x)  and  r(x)  are 
superimposed  in  program  16  is  simply  LrfJ ( x ) /x^t+lj  .  Error 
magnitudes  can  then  be  calculated  by  a  revised  version  of  (24): 


A(  xj ) 

Y  = - .  X.n"fl  (104) 

J  n'(x,)  J 

J 

where  n  is  the  codeword  length  and  d  is  the  designed  distance  of  the 
code. 


For  an  example  we  again  call  upon  the  Reed-Sol omon 
3-error-correcting  code  over  GF(ll)  with  a  =  2. 


! 
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v*- 

•JVi 


•rn,*r  •  w  n*m  i 


0  1 

0 

1  2 

0 

2  4 

0 

3  8 

3 

4  5 

9 

5  10 

9 

b  9 

10 

7  7 

0 

8  3 

8 

9  6 

4 

locations: 

a7,  a2,  a1 

locations: 

.  a3 ,  a8 ,  a9 

8  +  10  +  2  +  4 


3  +  10  +  9  + 


8  +  1  +  3  +  5 


+  10 


V0‘/S'  , 


v->: 


VvV.V 


1 . 


b)  evaluated  by  the  new  formula  (24): 


H(x)  =  4x3  +  7x  +  6 

n(x)  =  x4  +  4x3  +  8x2  +  8x  +  1 

fl1  (x)  «  4x3  +  x2  +  5x  +  8 
n  -  d  =  (q-1)  -  (2t+l)  =  3 

A(cr°)  ^  4  +  7  +  6  6 

n*(rt°)  4  +  1  +  5  +  8  7 

A(  rr 3 )  2  +  1  +  6  9 

Y2  =  -  - o;3  *  3  = - •  6  = - *6  =  3 

*'(*3)  2  +  9  +  7  +  8  4 


A(  o;8 )  9  +  10  +  6  3 

Y ^  =  ”  —  - ■  -- ^  =  ...  — w» —  •  5  =  —  '  ■  •  5  =  5 

i'(a.8)  9  +  9  +  4  +  8  8 

A(a9)  6  +  9  +  6  10 

Y4  = - a9’3  = - •  7  =  —  *7  =  6 

n’(a9)  6+3+B+8  3 

The  loop  in  the  initialization  box  of  program  16,  adapted  from 
Blahut  [29],  simultaneously  computes  the  erasure  locator  polynomial 
and  the  modified  syndrome  polynomial  for  initializing  bT(x)  and 
rT(x).  It  is  apparent  that  this  loop,  or  its  equivalent,  can  be 
added  to  the  initialization  box  of  any  of  the  decoding  programs 
discussed  earlier  for  converting  them  for  the  handling  of  erasures. 


Thus,  we  can  appropriately  modify  any  of  these  programs  to  yield 
both  n(x)  and  n(x)  for  error-and-erasure  decoding  of  BCH  codes.  As 
an  example  we  give  program  17,  which  is  an  adaptation  of  program  4 
for  erasure  decoding. 

In  program  17  f(x)  is  defined  as  x2t  and  g(x)  as  S(x),  so 
that  at  each  iteration  k  we  have  the  relation 


|rk(x)|  2t  =  J bk { x ) S( x ) 


2t 


holding.  In  particular,  for  k  =  -1,  rk(x)  =  x2t  and 
bk(x)  =  U  and  for  k  =  0,  rk(x)  =  S(x)  and  bk(x)  =  1.  At  the 
conclusion  of  the  initialization  loop  rk(x)  =  T(x)  and  bk(x)  = 
k(x).  At  termination  of  the  program  rk(x)  =  Q(x)  and  bk(x)  = 

Htx),  as  desired. 

Example  22:  Reed-Solomon  3-error-correcting  code  over  GF(ll)  with 
a  =  2 

t  =  3;  Let  c(x)  =  0,  e(x)  =  3x3  +  7,  d(x)  =  6x9  +  5x8. 
v ( x)  =  c(x)  +  d(x)  +  e(x)  =  6x9  +  5x8  +  3x3  +  7. 

S(x)  =  6x5  +  3x9  +  5x3  +  4x2  +  9x  +  5 


j  =  0:  r°(x)  =  x2t,  b°(x)  =  0 

rN{x)  =  S(x)  =  6x5  +  3x9  +  5x3  +  4x2  +  9x  +  5 
b^( x)  =  1 
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SECTION  10 


CONCLUSION 


The  decoding  algorithms  of  Sugiyama,  Kasahara,  Hirasawa,  and 
Namekawa,  of  Mills)  and  the  Berlekamp-Massey  algorithm  have  been 
reviewed  and  compared.  All  can  be  viewed  as  variants  of  Euclid's 
algorithm.  Various  enhancements  of  these  algorithms  have  been 
considered,  including  modifications  which  avoid  the  computation  of 
finite  field  inverses,  and  which  permit  decoding  of  erasures  in 
addition  to  errors. 

The  Japanese  algorithm  and  Mills'  algorithm  are  based  on  a 
direct  application  of  Euclid's  algorithm  to  solve  the  key  equation 
(14)  for  BCH  decoding.  We  have  seen  that  when  the  polynomial 
divisions  contained  in  these  algorithms  are  broken  down  into  their 
individual  partial  divisions  the  result  is  a  two-loop  structure 
depending  on  whether  a  polynomial  division  is  or  is  not  being 
completed.  These  decoding  algorithms,  therefore,  appear  to  be  at  a 
disadvantage  compared  to  the  single-loop  Berlekamp-Massey  algorithm. 

Treating  the  Berlekamp-Massey  algorithm  in  a  Euclidean  context 
yields  the  error  (or  errata)  evaluator  polynomial  in  addition  to  the 
locator  polynomial  and  obviates  the  need  to  perform  a  vector  Inner- 
product  calculation  for  computing  the  discrepancies.  In  this  form 
the  algorithm  appears  to  be  well-suited  for  VLSI  implementation  in  a 
systolic  array.  This  implementation  will  be  the  subject  of  further 
investigation. 
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RADC  plans  and  executes  research,  development,  test  and 
selected  acquisition  programs  in  support  of  Command,  Control, 
Communications  and  Intelligence  (C5I)  activities.  Technical  and 
engineering  support  within  areas  of  competence  is  provided  to 
ESD  Program  Offices  (POs)  and  other  ESD  elements  to 
perform  effective  acquisition  of  C3I  systems.  The  areas  of 
technical  competence  include  communications,  command  and 
control,  battle  management  information  processing,  surveillance 
sensors,  intelligence  data  collection  and  handling,  solid  state 
sciences,  electromagnetics,  and  propagation,  and  electronic 
reliability /maintainability  and  compatibility. 


